I recently ran into a problem that is new to me, and I'd appreciate some advice. I'm doing serial communication on Linux using the termios functions. I'm not using a real serial port, but the virtual gadget serial driver, /dev/ttyGS0. The file descriptor is opened as non-blocking.
My program periodically generates data and sends it to /dev/ttyGS0. There is no way to know whether the other end reads it. If it does not, some internal FIFO fills up and write returns a "would block" error. So far so good; I have no problem with that.
The problem is that when I want to close a file descriptor whose FIFO is full, the close function blocks! Not indefinitely, but for about 10 seconds.
I tried calling tcflush(uart->fd, TCOFLUSH) before closing, without any effect.
This behavior seems strange to me, and I found no description saying that close could block. Is there any way to avoid this, or at least to decrease the timeout? Where should I look for this timeout? The VTIME attribute has no effect on it either.
As Amardeep mentioned, the close() call is handled by the driver. Close itself is always a blocking call, but generally it's a fast one.
So, the answer is that the delay is specific to the virtual gadget driver. I don't have experience with that one to help.
How important is it to close the file? If the delay is a major problem and the file needs to be closed (such as avoiding file descriptor leaks in a long-running process), then the close will probably need to be called in a separate thread. Obviously, the best answer would be one specific to that driver; perhaps research there might yield an answer, such as an ioctl() call that clears the state of the virtual device.
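The "close in a separate thread" idea could look roughly like the sketch below. The helper name close_in_background is hypothetical, and this only moves the delay off the calling thread; it does not shorten it. If the process exits while the worker is still inside close(), the kernel still has to finish releasing the descriptor.

```c
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Thread body: performs the (possibly slow) close() off the caller's thread. */
static void *close_worker(void *arg)
{
    int fd = (int)(intptr_t)arg;
    close(fd);
    return NULL;
}

/*
 * Hypothetical helper: hand fd to a detached thread that closes it.
 * Returns 0 if the thread was started, or a pthread error number.
 * On failure the caller still owns fd and must close it itself.
 */
int close_in_background(int fd)
{
    pthread_t tid;
    pthread_attr_t attr;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    /* Detached: nobody needs to join the worker later. */
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    rc = pthread_create(&tid, &attr, close_worker, (void *)(intptr_t)fd);
    pthread_attr_destroy(&attr);
    return rc;
}
```

After a successful call the caller must treat fd as gone and never touch it again, since the worker may close it at any moment.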
You may need to configure your port's closing_wait parameter. From the setserial manual:
closing_wait delay
Specify the amount of time, in hundredths of a second, that the kernel should wait for data to be transmitted from the serial port while closing the port. If "none" is specified, no delay will occur. If "infinite" is specified the kernel will wait indefinitely for the buffered data to be transmitted. The default setting is 3000 or 30 seconds of delay. This default is generally appropriate for most devices. If too long a delay is selected, then the serial port may hang for a long time when a serial port which is not connected, and has data pending, is closed. If too short a delay is selected, then there is a risk that some of the transmitted data is not output at all.
If the device is extremely slow, like a plotter, the closing_wait may need to be larger.
Check with setserial the parameters for your port:
$ setserial -g -a /dev/ttyS0
/dev/ttyS0, Line 0, UART: 16550A, Port: 0x03f8, IRQ: 4
Baud_base: 115200, close_delay: 50, divisor: 0
closing_wait: 3000
Flags: spd_normal skip_test
In my case, a faulty device was not receiving the last bytes I sent it, and closing the port always took 30 seconds because of this. You can change this timeout with setserial, for example, to 1 second:
$ sudo setserial /dev/ttyS0 closing_wait 100
Of course, you may want to issue this command on startup in your /etc/rc.local or whatever script your distro uses to configure your ports.
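If you would rather set this from the program itself, the same parameter is exposed through the TIOCGSERIAL/TIOCSSERIAL ioctls on drivers that implement them. The sketch below is an assumption-laden example: the helper name set_closing_wait is made up, changing closing_wait normally requires root privileges, and not every driver supports these ioctls, in which case the call simply fails.

```c
#include <linux/serial.h>   /* struct serial_struct, ASYNC_CLOSING_WAIT_NONE */
#include <sys/ioctl.h>      /* ioctl, TIOCGSERIAL, TIOCSSERIAL */

/*
 * Hypothetical helper: set closing_wait on an open serial port.
 * The value is in hundredths of a second, as with setserial;
 * pass ASYNC_CLOSING_WAIT_NONE to request no wait at all.
 * Returns 0 on success, -1 on failure (errno is set by ioctl).
 */
int set_closing_wait(int fd, unsigned short centiseconds)
{
    struct serial_struct ss;

    /* Read the current settings so that only one field is changed. */
    if (ioctl(fd, TIOCGSERIAL, &ss) != 0)
        return -1;
    ss.closing_wait = centiseconds;
    if (ioctl(fd, TIOCSSERIAL, &ss) != 0)
        return -1;
    return 0;
}
```

For example, set_closing_wait(fd, 100) would request the same 1 second as the setserial command above.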
I faced the same issue. In my case, disabling flow control before closing the device helped. You can do this using the following function:
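#include <stdio.h>      /* perror */
#include <string.h>     /* memset */
#include <termios.h>    /* tcgetattr, tcsetattr, CRTSCTS */

/* Enable (control != 0) or disable (control == 0) RTS/CTS hardware flow control. */
int set_flowcontrol(int fd, int control)
{
    struct termios tty;

    memset(&tty, 0, sizeof tty);
    if (tcgetattr(fd, &tty) != 0)
    {
        perror("error from tcgetattr");
        return -1;
    }
    if (control)
        tty.c_cflag |= CRTSCTS;
    else
        tty.c_cflag &= ~CRTSCTS;
    if (tcsetattr(fd, TCSANOW, &tty) != 0)
    {
        perror("error setting term attributes");
        return -1;
    }
    return 0;
}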
Just call this before closing:
...
rc = set_flowcontrol(fd, 0);
if (rc != 0)
{
    /* perror() appends ": " and the error text itself */
    perror("error setting flowcontrol");
    exit(-1);
}
rc = close(fd);
if (rc != 0)
{
    perror("error closing fd");
    exit(-1);
}
...
I'm trying to communicate between two Linux systems via UART.
I want to send large chunks of data. At the specified baud rate it should take around 5 seconds, but it takes nearly 10 times as long.
As I'm sending more than the buffer can handle at once, it is sent in small parts and I'm draining the buffer in between. If I measure the time needed for the drain and the number of bytes written to the buffer, I calculate a baud rate nearly 10 times lower than the specified one.
I would expect the transmission to be slower than the optimum, but not by this much.
Did I miss something while setting the UART or while writing? Or is this normal?
The code used for setup:
int bus = open(interface.c_str(), O_RDWR | O_NOCTTY | O_NDELAY); // <- also tried blocking
if (bus < 0) {
    return;
}
struct termios options;
memset(&options, 0, sizeof options);
if (tcgetattr(bus, &options) != 0) {
    close(bus);
    bus = -1;
    return;
}
cfsetspeed(&options, B230400);
cfmakeraw(&options); // <- also tried this manually. did not make a difference
if (tcsetattr(bus, TCSANOW, &options) != 0)
{
    close(bus);
    bus = -1;
    return;
}
tcflush(bus, TCIFLUSH);
The code used to send:
int32_t res = write(bus, data, dataLength);
while (res < dataLength) {
    tcdrain(bus); // <- taking way longer than expected
    int32_t r = write(bus, &data[res], dataLength - res);
    if (r == 0)
        break;
    if (r == -1) {
        break;
    }
    res += r;
}
B230400
The docs are contradictory. cfsetspeed is documented as requiring a speed_t type, while the note says you need to use one of the "B" constants like "B230400." Have you tried using an actual speed_t type?
In any case, the speed you're supplying is the baud rate, which in this case should get you approximately 23,000 bytes/second, assuming there is no throttling.
The speed is dependent on hardware and link limitations. Also the serial protocol allows pausing the transmission.
FWIW, according to the time and speed you listed, if everything works perfectly, you'll get about 1 MB in 50 seconds. What speed are you actually getting?
Another "also" is the options structure. It's been years since I've had to do any serial I/O, but IIRC, you need to actually set the options that you want and are supported by your hardware, like CTS/RTS, XON/XOFF, etc.
This might be helpful.
As I'm sending more than the buffer can handle at once, it is sent in small parts and I'm draining the buffer in between.
You have only provided code snippets (rather than a minimal, complete, and verifiable example), so your data size is unknown.
But the Linux kernel buffer size is known. What do you think it is?
(FYI it's 4KB.)
If I measure the time needed for the drain and the number of bytes written to the buffer, I calculate a baud rate nearly 10 times lower than the specified one.
You're confusing throughput with baudrate.
The maximum throughput (of just payload) of an asynchronous serial link will always be less than the baudrate due to framing overhead per character, which could be two of the ten bits of the frame (assuming 8N1). Since your termios configuration is incomplete, the overhead could actually be three of the eleven bits of the frame (assuming 8N2).
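To make the framing arithmetic concrete, here is a small sketch (the function name is made up for illustration). At 230400 baud it gives 23040 payload bytes per second for 8N1 (10 bits per frame) and 20945 for 8N2 (11 bits per frame), rounded down.

```c
/*
 * Upper bound on payload bytes per second on an asynchronous serial line.
 * Each character is framed as: 1 start bit + data bits + parity bits + stop bits.
 * The result is rounded down and assumes the line never goes idle.
 */
long payload_bytes_per_second(long baud, int data_bits, int parity_bits, int stop_bits)
{
    int frame_bits = 1 + data_bits + parity_bits + stop_bits;
    return baud / frame_bits;
}
```

Any idle time between frames, or any flow-control pause, lowers the real figure below this bound.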
In order to achieve the maximum throughput, the transmitting UART must saturate the line with frames and never let the line go idle.
The userspace program must be able to supply data fast enough, preferably by one large write() to reduce syscall overhead.
Did I miss something while setting the UART or while writing?
With Linux, you have limited access to the UART hardware.
From userspace your program accesses a serial terminal.
Your program accesses the serial terminal in a sub-optimal manner.
Your termios configuration appears to be incomplete.
It leaves both hardware and software flow-control untouched.
The number of stop bits is untouched.
The "Ignore modem control lines" (CLOCAL) and "Enable receiver" (CREAD) flags are not set.
For raw reading, the VMIN and VTIME values are not assigned.
Or is this normal?
There are ways to easily speed up the transfer.
First, your program combines non-blocking mode with non-canonical mode. That's a degenerate combination for receiving, and suboptimal for transmitting.
You have provided no reason for using non-blocking mode, and your program is not written to properly utilize it.
Therefore your program should be revised to use blocking mode instead of non-blocking mode.
Second, the tcdrain() between write() syscalls can introduce idle time on the serial link. Use of blocking mode eliminates the need for this delay tactic between write() syscalls.
In fact with blocking mode only one write() syscall should be needed to transmit the entire dataLength. This would also minimize any idle time introduced on the serial link.
Note that the first write() does not properly check the return value for a possible error condition, which is always possible.
Bottom line: your program would be simpler and throughput would be improved by using blocking I/O.
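A minimal sketch of such a blocking send, assuming the descriptor was opened without O_NDELAY/O_NONBLOCK. The helper name write_all is hypothetical. In blocking mode a single write() will normally accept everything, but the loop also copes with short writes and with interruption by a signal.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to a blocking descriptor.
 * Returns len on success, or -1 on error (errno set by write()).
 * No tcdrain() is needed between the write() calls.
 */
ssize_t write_all(int fd, const unsigned char *data, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, data + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error: report it to the caller */
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

If the caller needs to know that the last byte has actually left the UART, a single tcdrain() after write_all() returns is enough.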
I've recently noticed a very odd behavior on my system (running on an AT91SAM9G15): even though I'm reading the serial port continuously, the TTY driver sometimes takes 1.2 s to deliver data from the input queue.
The thing is, I'm not losing any data; it just takes too many calls to read before it arrives.
Maybe my code will help to explain the problem.
First off, I set my serial port:
/* 8N1 */
tty.c_cflag = (tty.c_cflag & ~CSIZE) | CS8;
/** Parity bit (none) */
tty.c_cflag &= ~(PARENB | PARODD);
/** Stop bit (1)*/
tty.c_cflag &= ~CSTOPB;
/* Noncanonical mode */
tty.c_lflag = 0;
tty.c_oflag = 0;
tty.c_cc[VMIN] = 0;
tty.c_cc[VTIME] = 0;
Later on, select is called:
s_ret = select(rfid_fd + 1, &set, NULL, NULL, &port_timeval);
So read() can do its magic:
...
if ((rd_ret = read(rfid_fd, &recv_buff[u16_recv_len], (u16_req_len - u16_recv_len))) > 0)
...
Right afterwards, if I keep reading the serial port for 15 s, for example, there are several periods in which I see no data coming in, and data that I know arrived on time (it's timestamped) shows up late. Delays in fetching data from the input queue vary from 300 ms to 1.5 s.
I've tried every kind of setting I could think of. It's tricky now, since I don't know whether the at91 UART driver isn't delivering data to the tty driver or the tty driver isn't fetching it. Which is it?
Any help would be appreciated.
The normal procedure for setting port flags is to read the termios structure, save it for later restoring, modify (in a copy of it) the flags you want to change, and then call tcsetattr(). You set c_lflag = 0 outright, which can have side effects related to your problem.
Next, read the documentation for the VMIN and VTIME elements. Setting both to 0 makes read() return immediately with whatever is in the buffer (possibly nothing), so your process ends up polling in a loop. Keep in mind that your process, which takes characters out of the buffer, and the driver's interrupt routine, which puts received characters into it, are then competing for the buffer without rest. It is probably better (and this may well be the problem) to wait for one character to be available by setting VMIN to 1 and VTIME to 0. The driver then wakes your process as soon as one character is available, which is probably closer to what you want.
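The read-save-modify-set procedure with VMIN = 1 and VTIME = 0 might be sketched as follows. The helper name and the exact set of local flags cleared are assumptions for illustration; adjust them to what your device needs.

```c
#include <termios.h>

/*
 * Hypothetical helper: switch fd to non-canonical mode where read()
 * blocks until at least one byte is available (VMIN = 1, VTIME = 0).
 * The previous settings are stored in *saved so they can be restored
 * later with tcsetattr(fd, TCSANOW, saved).
 * Returns 0 on success, -1 on failure.
 */
int set_noncanonical_blocking(int fd, struct termios *saved)
{
    struct termios tty;

    if (tcgetattr(fd, saved) != 0)
        return -1;

    tty = *saved;                                    /* modify a copy only */
    tty.c_lflag &= ~(ICANON | ECHO | ECHOE | ISIG);  /* clear selected bits */
    tty.c_cflag |= CLOCAL | CREAD;                   /* ignore modem lines, enable receiver */
    tty.c_cc[VMIN] = 1;                              /* wait for at least one byte */
    tty.c_cc[VTIME] = 0;                             /* no inter-byte timer */

    return tcsetattr(fd, TCSANOW, &tty);
}
```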
All of this is guesswork: you haven't posted any reproducible code that could be used to check what you describe, so this is the most we can do to help you.
I have to use the Linux watchdog driver (/dev/watchdog). It works great; I write a character like this:
echo 1 > /dev/watchdog
The watchdog starts, and after about 1 minute the system reboots.
The question is, how can I change the timeout? Do I have to change the time interval in the driver?
Please read the Linux documentation. The standard method of changing the timeout from user space is to use an ioctl().
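#include <fcntl.h>            /* open, O_WRONLY */
#include <sys/ioctl.h>        /* ioctl */
#include <linux/watchdog.h>   /* WDIOC_SETTIMEOUT */

int timeout = 45; /* a time in seconds */
int fd;
fd = open("/dev/watchdog", O_WRONLY); /* open() requires a flags argument */
ioctl(fd, WDIOC_SETTIMEOUT, &timeout); /* Send time request to the driver. */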
Each watchdog device may have an upper (and possibly lower) limit that the hardware supports, so you cannot set the timeout arbitrarily high. After setting a timeout, it is therefore a good idea to read it back.
ioctl(fd, WDIOC_GETTIMEOUT, &timeout); /* Update timeout with driver value. */
Now the re-read timeout can be used to derive the kick interval.
assert(timeout > 2);
while (1) {
    ioctl(fd, WDIOC_KEEPALIVE, 0);
    sleep(timeout - 2);
}
You can write your own kicking routine in a script/shell command,
while [ 1 ] ; do sleep 1; echo V > /dev/watchdog; done
However, the userspace watchdog program is usually used. This should take care of all the esoteric features. You can nice the user space program to a minimum priority and then the system will reset if user space becomes hung up. BusyBox includes a watchdog applet.
Each watchdog driver has its own module parameters, and most include a mechanism to set the timeout; use either the kernel command line or the module parameter mechanism. However, the generic ioctl timeout is more portable if you do not have specific knowledge of your watchdog hardware. The ioctl is probably also more future-proof, in that your hardware may change.
Sample user space code is included in the Linux samples directory.
I'm developing on an am335x system with Ubuntu and the latest kernel released by TI (the vendor).
I'm using a virtual tty device (ttyUSB0) to communicate with a remote device. After about one hour of continuous communication (cyclic open-transmit-receive-close), read() starts behaving strangely. If the UART is opened in blocking mode, the read hangs forever (no matter what values I set for VMIN and VTIME). If I open it in non-blocking mode, it returns -1 forever (after 1 hour).
Now I'm using select() to check if there is data to be read.
If I receive a negative result from select, how should I handle the error? What is good practice? Do I have to restart the service?
This code is part of a service that starts at boot time (with upstart). When it hangs, restarting it makes it work again. The restart has no effect on the device I'm communicating with; that device keeps working properly.
This is a piece of code, just for completeness:
FD_ZERO(&set); /* clear the set */
FD_SET(tty_fileDescriptor, &set); /* add our file descriptor to the set */
timeout.tv_sec = 10;
timeout.tv_usec = 0;
rv = select(tty_fileDescriptor + 1, &set, NULL, NULL, &timeout);
if (rv > 0) {
    letti = read(tty_fileDescriptor, payLoadTMP, 300);
} else if (rv < 0) {
    perror("select");
    /* what to do here to re-establish communication? */
}
The perror's output is:
select: Resource temporarily unavailable
This is a grep on dmesg:
usb 1-1: cp210x converter now attached to ttyUSB0
Any ideas? How can I re-establish the connection?
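As a general pattern (not a confirmed fix for this cp210x case), the select() call can be wrapped so that the three outcomes are kept apart: data ready, timeout, and error. The helper name wait_readable is hypothetical. The descriptor set and the timeout are rebuilt on every attempt because select() may modify both, and an interruption by a signal (EINTR) is retried rather than treated as a failure. On any other error, closing and reopening the port from within the service is one thing to try before restarting the whole service.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper: wait until fd is readable.
 * Returns 1 if data is ready, 0 on timeout, -1 on error (errno set).
 */
int wait_readable(int fd, int timeout_sec)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EBADF;          /* FD_SET() must not be used with such values */
        return -1;
    }

    for (;;) {
        fd_set set;
        struct timeval tv;
        int rv;

        /* select() may modify both arguments, so rebuild them each time. */
        FD_ZERO(&set);
        FD_SET(fd, &set);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        rv = select(fd + 1, &set, NULL, NULL, &tv);
        if (rv < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: try again */
        if (rv < 0)
            return -1;          /* real error: caller may close and reopen */
        return rv > 0 ? 1 : 0;
    }
}
```

Note that a retry after EINTR restarts the full timeout, which is acceptable for a sketch but may matter if signals arrive frequently.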
I have a network client which is stuck in recvfrom a server not under my control which, after 24+ hours, is probably never going to respond. The program has processed a great deal of data, so I don't want to kill it; I want it to abandon the current connection and proceed. (It will do so correctly if recvfrom returns EOF or -1.) I have already tried several different programs that purport to be able to disconnect stale TCP channels by forging RSTs (tcpkill, cutter, killcx); none had any effect, the program remained stuck in recvfrom. I have also tried taking the network interface down; again, no effect.
It seems to me that there really should be a way to force a disconnect at the socket-API level without forging network packets. I do not mind horrible hacks, up to and including poking kernel data structures by hand; this is a disaster-recovery situation. Any suggestions?
(For clarity, the TCP channel at issue here is in ESTABLISHED state according to lsof.)
I do not mind horrible hacks
That's all you have to say. I am guessing the tools you tried didn't work because they sniff traffic to get an acceptable ACK number to kill the connection. Without traffic flowing they have no way to get hold of it.
Here are things you can try:
Probe all the sequence numbers
Where those tools failed, you can still do it yourself. Write a simple Python script using scapy that sends, for each sequence number, a RST segment with the correct 4-tuple (ports and addresses). There are at most 4 billion of them (actually fewer, assuming a decent window; you can find out the window for free using ss -i).
Make a kernel module to get hold of the socket
Make a kernel module getting a list of TCP sockets: look for sk_nulls_for_each(sk, node, &tcp_hashinfo.ehash[i].chain)
Identify your victim sk
At this point you intimately have access to your socket. So
You can call tcp_reset or tcp_disconnect on it. You won't be able to call tcp_reset directly (since it doesn't have EXPORT_SYMBOL) but you should be able to mimic it: most of the functions it calls are exported
Or you can get the expected ACK number from tcp_sk(sk) and directly forge a RST packet with scapy
Here is a function I use to print established sockets. I scrounged bits and pieces from the kernel to put it together some time ago:
#include <net/inet_hashtables.h>
#define NIPQUAD(addr) \
    ((unsigned char *)&addr)[0], \
    ((unsigned char *)&addr)[1], \
    ((unsigned char *)&addr)[2], \
    ((unsigned char *)&addr)[3]
#define NIPQUAD_FMT "%u.%u.%u.%u"
extern struct inet_hashinfo tcp_hashinfo;
/* Decides whether a bucket has any sockets in it. */
static inline bool empty_bucket(int i)
{
    return hlist_nulls_empty(&tcp_hashinfo.ehash[i].chain);
}
void print_tcp_socks(void)
{
    int i = 0;
    struct inet_sock *inet;
    /* Walk hash array and lock each if not empty. */
    printk("Established ---\n");
    for (i = 0; i <= tcp_hashinfo.ehash_mask; i++) {
        struct sock *sk;
        struct hlist_nulls_node *node;
        spinlock_t *lock = inet_ehash_lockp(&tcp_hashinfo, i);
        /* Lockless fast path for the common case of empty buckets */
        if (empty_bucket(i))
            continue;
        spin_lock_bh(lock);
        sk_nulls_for_each(sk, node, &tcp_hashinfo.ehash[i].chain) {
            if (sk->sk_family != PF_INET)
                continue;
            inet = inet_sk(sk);
            printk(NIPQUAD_FMT":%hu ---> " NIPQUAD_FMT
                   ":%hu\n", NIPQUAD(inet->inet_saddr),
                   ntohs(inet->inet_sport), NIPQUAD(inet->inet_daddr),
                   ntohs(inet->inet_dport));
        }
        spin_unlock_bh(lock);
    }
}
You should be able to pop this into a simple "Hello World" module and after insmoding it, in dmesg you will see sockets (much like ss or netstat).
I understand that what you want is to automate the process in order to run a test. But if you just want to check that the recvfrom error is handled correctly, you could attach to the process with GDB and close the fd with a close() call.
Here you could see an example.
Another option is to use scapy to craft proper RST packets (which is not in your list). This is how I tested connection resets in a bridged system (IMHO it is the best option); you could also implement a graceful shutdown.
Here is an example of the scapy script.