I want to make sure the latency between my app and the Bluetooth headphones is accounted for, but I have absolutely no idea how I can get this value. The closest thing I found was:
BluetoothLEPreferredConnectionParameters.ConnectionLatency, which is only available on Windows 11... Otherwise there isn't much to go on.
Any help would be appreciated.
Thanks,
Peter
It's very difficult to get the exact latency because it is affected by many parameters - but you're on the right track in guessing that the connection parameters are part of this equation. I don't have much knowledge of UWP, but I can give you the general parameters that affect speed/latency, and then you can check their availability in the API or even contact the Windows technical team to see if they are supported.
When you make a connection with a remote device, the following factors impact the speed/latency of the connection:-
Connection Interval: this specifies the interval between connection events, i.e. how often the two devices exchange packets during a connection. The lower the value, the lower the latency (and the higher the potential throughput). The minimum value as per the Bluetooth spec is 7.5 ms.
Slave Latency: this is the value you originally mentioned - it specifies the number of connection events the peripheral is allowed to skip when it has no data to send. A value of 0 means the peripheral responds at every connection event, which gives the lowest-latency, most responsive connection.
Connection PHY: this is the modulation on which the packets are sent. If both devices support the 2M PHY, then the connection should be quicker.
Data Length/MTU Extension: these are two separate features but I am grouping them together because the effect is the same - more bytes are sent per packet, which results in higher throughput. The maximum value is 251 bytes per packet. A rough throughput estimate based on these parameters is sketched below.
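To get a feel for how these parameters combine, here is a rough back-of-the-envelope estimate in C. The payload per packet and the number of packets per connection event are illustrative assumptions (they depend on the stack and controller), not values read from any Windows API:

#include <stdio.h>

int main(void)
{
    /* all of these are assumed example values, not queried from the stack */
    double conn_interval_s    = 0.0075; /* 7.5 ms, the spec minimum             */
    int    packets_per_event  = 4;      /* controller/stack dependent (assumed) */
    int    payload_per_packet = 244;    /* ~251-byte data length minus headers  */

    double bytes_per_second = packets_per_event * payload_per_packet / conn_interval_s;
    printf("rough throughput upper bound: %.0f kB/s\n", bytes_per_second / 1000.0);
    return 0;
}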
You can find more information about these parameters here:-
A Practical Guide to BLE Throughput
Maximizing BLE Throughput: Everything You Need to Know
Bluetooth 5 Speed - How to Achieve Maximum Throughput
And below are some other links that might help you understand what is supported on UWP:-
Bluetooth Developer FAQ
BluetoothLEConnectionParameters.OptimizedProperty
Bluetooth LE Preferred Connection Parameter Class
Bluetooth LE Connection PHY class
How to Change MTU Size and PHY on Windows UWP C++
I'm currently working with termios for serial communication in Linux.
I need to set an intercharacter timeout to 5ms.
I found a way to set an intercharacter timeout using VMIN and VTIME, where VMIN > 0 and VTIME > 0.
The problem is that I need to set VTIME to 5 ms, but VTIME is expressed in tenths of a second.
VTIME's data type is unsigned char, so I can't just set it to 0.05.
Does anyone know if there is some way around this?
I need to set an intercharacter timeout to 5ms.
...
Does anyone know if there is some way around this?
No, there is no way to set a shorter termios timeout than 100 ms.
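To make the granularity concrete, here is a minimal sketch of the relevant termios settings (the device path is only an example); even the smallest nonzero VTIME is one tenth of a second:

#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

int open_serial(const char *dev)       /* e.g. "/dev/ttyS0" (example path) */
{
    int fd = open(dev, O_RDWR | O_NOCTTY);
    struct termios tio;

    tcgetattr(fd, &tio);
    cfmakeraw(&tio);
    tio.c_cc[VMIN]  = 1;  /* read() blocks until at least one byte arrives */
    tio.c_cc[VTIME] = 1;  /* intercharacter timeout in deciseconds:
                             1 = 100 ms, the smallest nonzero value        */
    tcsetattr(fd, TCSANOW, &tio);
    return fd;
}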
Depending on your hardware and kernel configuration, this timeout may not be reliable at all, especially if you are trying to detect time-separated messages.
The termios handling is at least a full layer above the UART device driver (see Linux serial drivers).
Unless your kernel is configured to ensure that the bottom half of the UART driver and the kworker threads for termios are high-priority and low-latency, short intercharacter intervals cannot be accurately or reliably detected.
If the UART utilizes a FIFO to buffer incoming data, then that hardware obscures the intercharacter spacing that the software can detect.
Similarly when the UART driver is using DMA to store the received data, intercharacter timing will be obscured.
With DMA the CPU is not involved with handling the received data until the DMA operation is complete, and all temporal information about any intercharacter separation is gone.
(Crucial information such as framing error and/or parity error is difficult/impossible to pinpoint to a specific byte when using DMA.)
Even without DMA, termios will only be able to use timing based on the transfer of data through the tty flip buffers (which is a layer removed from the timing on the wire).
Some UARTs do have hardware that assist in detecting the end-of-message by idle line.
For example Atmel/Microchip ATSAMA5 and AT91SAM9 SoCs have USARTs with a Receiver Timeout feature that measures the idle time after each received frame.
When this idle line time exceeds a specified value, an interrupt can be generated.
The Linux driver for the Atmel USART typically uses the receiver-timeout interrupt to (prematurely) terminate the current DMA receive operation, and copy the contents of the DMA buffer to the tty flip buffer.
In summary, you cannot (or at least should not) rely solely on VMIN and VTIME settings to detect time-separated messages. See Parsing time-delimited UART data.
The message packets need to have delimiter/sentinel characters/bytes so that messages can be reliably parsed and validated.
See parsing complete messages from serial port for an example of efficient use of syscalls with a local buffer.
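As a rough illustration of that approach (the 0x7E sentinel, buffer size, and handle_message() callback below are assumptions for the sketch, not anything prescribed), the idea is to read whatever is available into a local buffer and split on the delimiter in user space rather than relying on interbyte timing:

#include <string.h>
#include <unistd.h>

/* fd is an already-configured serial descriptor; handle_message() is a
 * hypothetical callback supplied by the caller. */
void read_messages(int fd, void (*handle_message)(const unsigned char *, size_t))
{
    unsigned char buf[4096];
    size_t len = 0;

    for (;;) {
        ssize_t n = read(fd, buf + len, sizeof(buf) - len);
        if (n <= 0)
            break;                              /* error or EOF                */
        len += (size_t)n;

        size_t start = 0;
        for (size_t i = 0; i < len; i++) {
            if (buf[i] == 0x7E) {               /* assumed end-of-message byte */
                handle_message(buf + start, i - start);
                start = i + 1;
            }
        }
        memmove(buf, buf + start, len - start); /* keep any partial message    */
        len -= start;
    }
}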
When I sent small data (16 bytes and 128 bytes) continuously (using a loop of 100 iterations with no inserted delay), the throughput with the TCP_NODELAY setting did not seem as good as with the normal setting. Additionally, TCP slow start appeared to affect the transmission in the beginning.
The reason I care is that I want to control a device from a PC via Ethernet. The processing time of this device is around several microseconds, but the huge latency of sending a command affects the entire system. Could you share some ways to solve this problem? Thanks in advance.
Previously, I measured the transfer performance between a Windows PC and a Linux embedded board. To verify TCP_NODELAY, I set up a system with two Linux PCs connected directly to each other, i.e. Linux PC <--> Router <--> Linux PC. The router was used only for these two PCs.
The performance without TCP_NODELAY is shown as follows. It is easy to see that the throughput increased significantly when the data size was >= 64 KB. Additionally, when the data size was 16 B, sometimes the receive time dropped to 4.2 µs. Do you have any explanation for this observation?
The performance with TCP_NODELAY seems unchanged, as shown below.
The full code can be found at https://www.dropbox.com/s/bupcd9yws5m5hfs/tcpip_code.zip?dl=0
Please share with me your thinking. Thanks in advance.
I am doing socket programming to transfer a binary file between a Windows 10 PC and a Linux embedded board. The socket libraries are winsock2.h and sys/socket.h for Windows and Linux, respectively. The binary file is copied into an array on Windows before sending, and the received data are stored in an array on Linux.
Windows: socket_send(sockfd, &SOPF->array[0], n);
Linux: socket_recv(&SOPF->array[0], connfd);
I could receive all the data properly. However, it seems to me that the transfer time depends on the size of the data being sent. When the data size is small, the received throughput is quite low, as shown below.
Could you please show me some documents explaining this problem? Thank you in advance.
To establish a TCP connection, you need a 3-way handshake: SYN, SYN-ACK, ACK. Then the sender will start to send some data. How much depends on the initial congestion window (configurable on Linux; I don't know about Windows). As long as the sender receives timely ACKs, it will continue to send, as long as the receiver's advertised window has space (use the socket option SO_RCVBUF to set it). Finally, closing the connection also requires a FIN, FIN-ACK, ACK.
So my best guess, without more information, is that the overhead of setting up and tearing down the TCP connection has a huge effect on the overhead of sending a small number of bytes. Nagle's algorithm (disabled with TCP_NODELAY) shouldn't have much effect as long as the writer is effectively writing quickly. It only prevents sending less-than-full-MSS segments, which should increase transfer efficiency in this case, where the sender is simply sending data as fast as possible. The only effect I can see is that the final less-than-full-MSS segment might need to wait for an ACK, which again would have more impact on the short transfers compared to the longer transfers.
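If connection setup really is the dominant cost, a simple check is to connect once and reuse the same socket for every small command, so the 3-way handshake is paid only once. A minimal sketch (the address, port, and 16-byte command are placeholders, and error checking is omitted):

#include <arpa/inet.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(8888);                       /* example port    */
    inet_pton(AF_INET, "192.168.0.10", &addr.sin_addr);  /* example address */
    connect(fd, (struct sockaddr *)&addr, sizeof(addr));

    char cmd[16] = {0};                  /* one small command per iteration    */
    for (int i = 0; i < 100; i++)        /* 100 sends over a single connection */
        send(fd, cmd, sizeof(cmd), 0);

    close(fd);
    return 0;
}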
To illustrate this, I sent one byte using netcat (nc) on my loopback interface (which isn't a physical interface, and hence the bandwidth is "infinite"):
$ nc -l 127.0.0.1 8888 >/dev/null &
[1] 13286
$ head -c 1 /dev/zero | nc 127.0.0.1 8888 >/dev/null
And here is a network capture in Wireshark:
It took a total of 237 microseconds to send one byte, which is a measly 4.2 KB/second. I think you can guess that if I sent 2 bytes, it would take essentially the same amount of time, for an effective rate of 8.4 KB/second, a 100% improvement!
The best way to diagnose performance problems in networks is to get a network capture and analyze it.
When you make your test with a significant amount of data, for example your biggest test (512 MiB, about 536 million bytes), the following happens.
The data is sent by the TCP layer, which breaks it into segments of a certain length. Let's assume segments of 1460 bytes, so there will be about 367,000 segments.
For every segment transmitted there is an overhead (control and management data added to ensure good transmission): in your setup, there are 20 bytes for TCP, 20 for IP, and 16 for Ethernet, for a total of 56 bytes per segment. Please note that this number is a minimum, not accounting for the Ethernet preamble, for example; moreover, the IP and TCP overhead can sometimes be bigger because of optional fields.
Well, 56 bytes for every segment (367,000 segments!) means that when you transmit 512 MiB, you also transmit 56 * 367,000 = 20 million bytes on the line. The total number of bytes becomes 536 + 20 = 556 million bytes, or about 4,448 million bits. If you divide this number of bits by the elapsed time, 4.6 seconds, you get a bitrate of about 966 megabits per second, which is higher than what you calculated without taking the overhead into account.
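The same back-of-the-envelope calculation, written out as a small program (the 1460-byte segment payload and 56-byte per-segment overhead are the assumed minimums discussed above):

#include <stdio.h>

int main(void)
{
    double payload = 536e6;          /* ~512 MiB of application data          */
    double mss     = 1460.0;         /* assumed payload bytes per segment     */
    double perseg  = 56.0;           /* TCP + IP + Ethernet overhead, minimum */

    double segments  = payload / mss;                    /* ~367,000          */
    double wire_bits = (payload + segments * perseg) * 8.0;
    printf("segments: %.0f, rate over 4.6 s: %.0f Mbit/s\n",
           segments, wire_bits / 4.6 / 1e6);
    return 0;
}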
From the above calculation, it seems that your Ethernet is gigabit. Its maximum transfer rate should be 1,000 megabits per second, and you are getting really close to it. The rest of the time is due to more overhead we didn't account for, and to latencies that are always present and tend to be amortized as more data is transferred (but they never disappear completely).
I would say that your setup is OK. But this is for big data transfers. As the size of the transfer decreases, the overhead in the data, the latencies of the protocol, and other such things become more and more important. For example, if you transmit 16 bytes in 165 microseconds (the first of your tests), the result is 0.78 Mbps; if it took 4.2 µs, about 40 times less, the bitrate would be about 31 Mbps (40 times bigger). These numbers are lower than expected.
In reality, you don't transmit 16 bytes, you transmit at least 16 + 56 = 72 bytes, which is 4.5 times more, so the real transfer rate of the link is also bigger. But, you see, transmitting 16 bytes over a TCP/IP link is like measuring the flow rate of an empty aqueduct by dropping a few drops of water into it: the drops get lost before they reach the other end. This is because TCP/IP and Ethernet are designed to carry much more data, with reliability.
Comments and answers on this page point out many of the mechanisms that trade bitrate and reactivity for reliability: the 3-way TCP handshake, the Nagle algorithm, checksums and other overhead, and so on.
Given the design of TCP/IP and Ethernet, it is perfectly normal that, for small amounts of data, performance is not optimal. From your tests you can see that the transfer rate climbs steeply when the data size reaches 64 KB. This is not a coincidence.
From a comment you left above, it seems that you are looking for low-latency communication rather than high bandwidth. It is a common mistake to confuse these different kinds of performance. Moreover, with respect to this, I must say that TCP/IP and Ethernet are completely non-deterministic. They are quick, of course, but nobody can say how quick, because there are too many layers in between. Even in your simple setup, if a single packet gets lost or corrupted, you can expect delays of seconds, not microseconds.
If you really want something with low latency, you should use something else, for example a CAN bus. Its design is exactly what you want: it transmits little data with high speed, low latency, and deterministic timing (just microseconds after you have transmitted a packet, you know whether it has been received or not; to be more precise, exactly at the end of the transmission of a packet you know whether it reached the destination).
TCP sockets typically have an internal buffer. In many implementations, the stack will wait a little while before sending a packet, to see if it can fill up the remaining space in the buffer first. This is called Nagle's algorithm. I assume that the times you report above are not due to overhead in the TCP packet, but to the fact that TCP waits for you to queue up more data before actually sending.
Most socket implementations therefore have a parameter or flag called something like TcpNoDelay, which can be false (default) or true. I would try toggling that and seeing whether it affects your throughput. Essentially this flag enables/disables Nagle's algorithm.
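In BSD-style socket APIs (both the winsock2 and sys/socket.h interfaces you are using), the flag is the TCP_NODELAY option, set with setsockopt() on the socket. A minimal sketch for the Linux side (on Windows the option value is passed as a const char pointer, but the option name is the same):

#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_NODELAY */
#include <sys/socket.h>

/* Disable Nagle's algorithm on an already-created TCP socket.
 * Returns 0 on success, -1 on error (errno is set). */
int disable_nagle(int sockfd)
{
    int flag = 1;
    return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof(flag));
}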
I have a system which uses a UART clocked at 26 MHz. This is a 16850 UART on an x86 architecture. I have no problems accessing the port. The largest incoming message is about 56 bytes, the largest outgoing about 100. The baud rate divisor needs to be 1, so setserial /dev/ttyS4 baud_base 115200 is OK, as is opening at 115200. There is no flow control. Specifying the UART type as 16850 does NOT set the FIFOs to their deep mode. I was losing bytes. All the data is bytes (unsigned char).
I wrote a routine that uses ioperm to set the deep FIFOs to 64, and now a read/write works, meaning that the deep FIFOs are NOT being enabled by serial_core.c or 8250.c, at least not to their full depth.
With the deep FIFO set by brute force after open(fd, "/dev/ttyS4", NO_BLOCKING, etc., I reliably get the correct number of bytes, but I tend to get the same word missing a bit. Not a byte, a bit.
All this same stuff runs fine under DOS so it is not a hardware issue.
I have opened the port for raw mode, no delays, no parity, 8 data bits, 2 stop bits.
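(For reference, that configuration amounts to something like the following simplified sketch; the device path and baud rate are just the ones mentioned above.)

#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

int open_8n2_raw(const char *dev)            /* e.g. "/dev/ttyS4" */
{
    int fd = open(dev, O_RDWR | O_NOCTTY | O_NONBLOCK);
    struct termios tio;

    tcgetattr(fd, &tio);
    cfmakeraw(&tio);                          /* raw: no line-discipline processing  */
    tio.c_cflag &= ~(PARENB | CRTSCTS);       /* no parity, no hardware flow control */
    tio.c_cflag |= CSTOPB | CS8 | CLOCAL | CREAD;  /* 2 stop bits, 8 data bits       */
    cfsetispeed(&tio, B115200);
    cfsetospeed(&tio, B115200);
    tcsetattr(fd, TCSANOW, &tio);
    return fd;
}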
Has anyone seen issues reading serial ports at relatively high speeds with short bursts of data?
Yes, I have tried custom baud rates, etc. The FIFO levels made the biggest improvement. This is an ISA bus card using IRQ 7.
It appears the serial driver for Linux sucks and has way too much latency and far too many features for really basic raw operation.
Has anyone else tried very high-speed data without flow control, or had similar issues? As I stated, I get the correct number of bytes and all the data is correct except for 1 bit in byte 4.
I am pretty stumped.
I have bit-bang code that allows me to send about 4 MB of data through SPI lines. It's embedded code for custom hardware running a Linux kernel.
The problem is that it takes a VERY long time (4 hours), most likely because the kernel is doing other stuff as well. Basically my code is something like this (approx.):
unsigned char data = 0xFF;
int i;

BB_SPI_Init();
SPI_start();                          /* activates chip select (enable)    */
for (i = 0; i < 8; i++) {
    /* put the current MSB on the MOSI line */
    if (data & 0x80) {
        gpio_set_value(SPI_MOSI, 1);
    } else {
        gpio_set_value(SPI_MOSI, 0);
    }
    /* pulse the clock so the slave samples the bit */
    gpio_set_value(SPI_CLK, 0);
    gpio_set_value(SPI_CLK, 1);
    data <<= 1;                       /* shift the next bit into the MSB   */
}
SPI_stop();                           /* deactivates chip select (disable) */
So it is a very simple bit bang, but I noticed that if I use write() to send data through the Linux GPIO sysfs interface /sys/class/gpio/gpioXX/value (where XX is any GPIO number), it takes 4 hours.
But if I use fwrite() to send to the same device, it takes 3 hours.
BUT, if I use write() only for the enables (SPI_stop() and SPI_start()) and fwrite() for sending to MOSI and CLK, it only takes 1 hour and 30 minutes.
So, with that as a base, could someone explain to me how that is happening? My guess is that it's the way the threads are handled, and that in every software cycle two threads (fwrite() and write()) get resolved instead of only one of the functions being used, but I'm still investigating. Can someone point me to any relevant information? Is there a better way to handle this?
FYI
I can't use the kernel SPI driver because the hardware is connected to GPIOs, and it is a mandatory requirement to use bit banging, but I accept any suggestions.
Thanks in advance
EDIT
Hey guys, thanks for your comments. It turns out I had a (very dumb) problem: I was opening the file descriptor every time I was about to send data to /sys/class/gpio/gpioXX/value, and that's why it was slow. I also turned off some other programs, and the transfer speed skyrocketed: 3 minutes instead of 1 hour 30 minutes (with write()). Thanks, and sorry about it.
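For anyone hitting the same thing, the fix amounts to opening each value file once and then reusing the descriptor for every bit (the paths and pin numbers below are just examples):

#include <fcntl.h>
#include <unistd.h>

static int mosi_fd, clk_fd;

static void bb_gpio_open(void)
{
    /* open once, outside the per-bit loop */
    mosi_fd = open("/sys/class/gpio/gpio23/value", O_WRONLY);  /* example pin */
    clk_fd  = open("/sys/class/gpio/gpio24/value", O_WRONLY);  /* example pin */
}

static inline void gpio_write(int fd, int v)
{
    write(fd, v ? "1" : "0", 1);    /* one write() per edge, no open()/close() */
}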
I think that the spi-bitbang driver is the best solution if you are looking for performance. Doing the bit bang from user space is painful because you have at least 3 system calls for each bit of data, and a system call is an expensive operation. For 4 MB of data that is on the order of 100 million system calls for the data bits alone, so even a few microseconds per call adds up to many minutes.
FYI I can't use the kernel SPI driver because the hardware is connected to GPIOs, and it is a mandatory requirement to use bit banging, but I accept any suggestions
That's why the spi-bitbang driver exists. You can easily configure the spi-bitbang driver to work with your GPIOs.
Then, once you have a spi-bitbang driver, you can write a char device that accepts your entire block of data as input and transfers it entirely in kernel space. With this solution you will get the maximum performance a bit-banged interface can offer.