Android BLE callback OnCharacteristicWrite stops after a few seconds

I am trying to write the next packet synchronously, based on the onCharacteristicWrite callback, in order to achieve maximum throughput. But for some reason the onCharacteristicWrite callback stops being triggered after only 1-2 seconds, and it never gets called again even if I resend the packets. It works well if I add a delay per packet, but I do not want to add any delay, because I want maximum throughput.
Is there any way I could achieve maximum throughput without adding any delay?
Also, what exactly does sending multiple packets per connection interval mean (and is there any way I could achieve it from the peripheral side)?

If you use Write Without Response (see https://developer.android.com/reference/android/bluetooth/BluetoothGattCharacteristic.html#setWriteType(int)), you will be able to send multiple packets per connection interval.
Android KitKat unfortunately has broken flow control when you send multiple packets with "Write Without Response". If you try on a newer Android device, it should work properly.
If the writeCharacteristic method returns true, it just means it has passed your packet to the Bluetooth process. You can see the exact logic in the source code at https://android.googlesource.com/platform/frameworks/base/+/fe2bf16a2b287c3c748cd6fa7c14026becfe83ff/core/java/android/bluetooth/BluetoothGatt.java#1081. Basically it returns true if the characteristic has the write property, the gatt object is valid and there is currently no other pending GATT operation going on.
The onCharacteristicWrite callback will be called with status=0 when the Write Response has arrived (for Write With Response), or when the Bluetooth stack is ready and has buffer space to accept a new packet (for Write Without Response).
I recently wrote a post about that which you could read: onCharacteristicWrite and onNotificationSent are being called too fast - how to acquire real outgoing data rates?.
If you want a simple workaround for KitKat you could write 10 packets as Write Without Response and then the 11th packet as Write With Response and then start over with Write Without Responses. That should give you decent performance.

Related

UDP: doubts about write() and socket timeout (SO_SNDTIMEO) when the socket buffer is full

I'm having some problems understanding how socket buffers and timeouts are managed under Linux, when using UDP. I'm using the OpenWrt embedded Linux distribution, with kernel version 4.14.63.
In order to better understand these concepts, I'm trying to analyze the code used by a client of the iPerf open source network measurement program when sending UDP packets to test parameters such as achievable throughput. It is written in C and C++.
In particular, I tried setting an offered traffic value much higher than what the network (and consequently the receiver) can deliver, obtaining, as expected, a certain packet loss.
In this case, thanks to iPerf computing the loop time after the transmission of each packet, using timestamps, I was able to estimate how much time the application took to write each packet to the UDP buffer.
The packets are actually written inside a while() loop, which calls write() on a socket for each of them.
A timeout is also set, once, on the socket by calling:
setsockopt(mSettings->mSock, SOL_SOCKET, SO_SNDTIMEO, (char *)&timeout, sizeof(timeout))
This should set a send timeout when writing to the socket, which is, of course, a blocking one.
When the buffer is full, the write() call is blocking and I can see the loop time increasing a lot; the problem is that I can't really understand for how much time this call blocks the application.
In general, when a write() blocks, does it unblock as soon as there is room for a new packet? Or does it wait longer? It seems to wait longer: as far as I was able to tell, when setting a "big" UDP send buffer (800 KB, while sending UDP datagrams with a 1470 B payload), it appears to wait for around 700 ms, letting the networking stack, which is continuously sending data, empty the buffer by much more than the space a single packet would require. Why?
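For reference, the calls described above boil down to roughly the following (a simplified sketch, not the actual iPerf code; error handling is abbreviated):

#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Set a send timeout (in milliseconds) on an already-created UDP socket. */
static int set_send_timeout(int sock, int ms)
{
    struct timeval timeout = { .tv_sec = ms / 1000,
                               .tv_usec = (ms % 1000) * 1000 };
    return setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO,
                      (char *)&timeout, sizeof(timeout));
}

/* One transmission from the send loop.  According to socket(7), if the
 * timeout expires before any byte of the datagram has been accepted,
 * write() should return -1 with errno set to EAGAIN/EWOULDBLOCK. */
static ssize_t send_one(int sock, const void *buf, size_t len)
{
    ssize_t n = write(sock, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        fprintf(stderr, "write() timed out with the send buffer full\n");
    else if (n < 0)
        perror("write");
    return n;
}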
The other doubt I have is related to the timeout: I made small modifications to the source code in order to log the return value of each write() call, and I observed that no errors are ever encountered, even when setting a timeout of 300 ms or 600 ms, which is less than the 700 ms value observed before.
By logging also the evolution of the buffer (together with the loop time at each packet transmission), thanks to ioctl:
ioctl(mSettings->mSock,TIOCOUTQ,&bufsize);
I was able to observe, however, that setting the timeout to 300 ms or 600 ms actually made a difference: when the full buffer was detected, the blocking write() waited for around 300 ms in the first case, or 600 ms in the second.
So, even though no errors are detected, the timeout does seem to expire; nevertheless, the write operation appears to complete correctly in every case.
Is this possible? Can it happen because write() blocked the application long enough to completely write the data by the time the timeout expired?
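The logging described above amounts to something like the following (a simplified sketch; the real iPerf code and variable names differ):

#include <stdio.h>
#include <sys/ioctl.h>   /* TIOCOUTQ (a.k.a. SIOCOUTQ for sockets) */
#include <sys/types.h>

/* Log how many bytes are still queued in the socket send buffer,
 * together with the result of the last write() call. */
static void log_tx_state(int sock, ssize_t last_write_result)
{
    int queued = 0;
    if (ioctl(sock, TIOCOUTQ, &queued) == 0)
        printf("write() returned %zd, %d bytes still queued for TX\n",
               last_write_result, queued);
}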
I'm a bit confused about this.
Thank you very much in advance.

Why does latency vary in a WebSocket when it's a static connection?

HTTP creates a new connection for each piece of data to be transferred over the network, whereas a WebSocket is static: the connection is made once initially and stays open until the transmission is done. But if WebSockets are static, why does the latency differ for each data packet?
The latency test app I have created shows me a different time lag each time. So what is the advantage of a WebSocket being a static connection, or is this a common issue with WebSockets?
Do I need to create a buffer to control the flow of data, because the data transmission is continuous?
Does the latency increase when the data transmission is continuous?
There is no overhead to establish a new connection with a statically open web socket (the connection is already open and established), but networking still takes time: when you're making a request to a server halfway around the world, there's latency.
That's just how networking works.
You get a near immediate response from a server on your own LAN, and the further away the server gets (in terms of network topology), the more routers each packet must transit through and the more total delay there is. As you witnessed in your earlier question related to this topic, when you did a tracert from your location to your server location, you saw a LOT of different hops that each packet has to traverse. The time for each one of these hops adds up, and busy routers may also each add a small delay if they aren't instantly processing your packet.
The latency between when you send a packet and get a response is just 2x the packet transit time plus whatever your server takes to respond plus perhaps a tiny little overhead for TCP (since it's a reliable protocol, it needs acknowledgements). You cannot speed up the transit time unless you pick a server that is closer or somehow influence the route the packet takes to a faster route (this is mostly not under your control once you've selected a local ISP to use).
No amount of buffering on your end will decrease the roundtrip time to your server.
In addition, the more hops in the network there are between your client and server, the more variation you may get from one moment to the next in the transit time. Each one of the routers the packet traverses and each one of the links it travels on has their own load, congestion, etc... that varies with time. There will likely be a minimum transit time that you will ever observe (it will never be faster than x), but many things can influence it over time to make it slower than that in some moments. There could even be a case of an ISP taking a router offline for maintenance which puts more load on the other routers handling the traffic or a route between hops going down so a temporary, but slower and longer route is substituted in its place. There are literally hundreds of things that can cause the transit time to vary from moment to moment. In general, it won't vary a lot from one minute to the next, but can easily vary through the day or over longer periods of time.
You haven't said whether this is relevant or not, but when you have poor latency on a given roundtrip or when performance is very important, what you want to do is to minimize the number of roundtrips that you wait for. You can do that a couple of ways:
1. Don't sequence small pieces of data. The slowest way to send lots of data is to send a little bit of data, wait for a response, send a little more data, wait for a response, etc... If you had 100 bytes to send and you sent the data 1 byte at a time, waiting for a response each time, and your roundtrip time was X, you'd have 100X as your total time to send all the data. Instead, collect up a larger piece of the data and send it all at once (see the sketch after this list). If you send the 100 bytes all at once, you'd probably only have a total delay of X rather than 100X.
2. If you can, send data in parallel. As explained above, the pattern of send data, wait for response, send more data, wait for response is slow when the roundtrip time is poor. If your data can be tagged such that it stands on its own, then sometimes you can send data in parallel without waiting for prior responses. In the above example, it was very slow to send 1 byte, wait for response, send next byte, wait for response. But if you send 1 byte, then send the next byte, then send the next byte, and then some time later you process all the responses, you get much, much better throughput. Obviously, if you already have 100 bytes of data, you may as well just send that all at once, but if the data is arriving in real time, you may want to just send it out as it arrives and not wait for prior responses. Obviously, whether you can do this depends entirely upon the data protocol between your client and server.
3. Send bigger pieces of data at a time. If you can, send bigger chunks of data at once. Depending upon your app, it may or may not make sense to actually wait for data to accumulate before sending it, but if you already have 100 bytes of data, then try to send it all at once rather than sending it in smaller pieces.
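To make the round-trip arithmetic in point 1 concrete, here is a minimal sketch using a plain blocking TCP socket rather than a WebSocket, against a hypothetical server that answers each write with a one-byte acknowledgement:

#include <unistd.h>

/* Worst case: one round trip per byte, so roughly 100X total latency. */
static void send_byte_by_byte(int sock, const char *data, int len)
{
    char ack;
    for (int i = 0; i < len; i++) {
        write(sock, &data[i], 1);   /* send one byte...           */
        read(sock, &ack, 1);        /* ...and wait for its ack    */
    }
}

/* Batched: one write, one ack, so roughly a single round trip (X). */
static void send_batched(int sock, const char *data, int len)
{
    char ack;
    write(sock, data, len);
    read(sock, &ack, 1);
}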

Linux CAN bus transmission timeout

Scenario
There is a Linux-powered device connected to a CAN bus. The device periodically transmits the CAN message. The nature of the data carried by this message is like measurement rather than command, i.e. only the most recent one is actually valid, and if some messages are lost that is not an issue as long as the latest one was received successfully.
Then the device in question is being disconnected from the CAN bus for some amount of time that is much longer than the interval between subsequent message transmissions. The device logic is still trying to transmit the messages, but since the bus is disconnected the CAN controller is unable to transmit any of them so the messages are being accumulated in the TX queue.
Some time later the CAN bus connection is restored, and all the accumulated messages are being kicked on the bus one by one.
Problem
When the CAN bus connection is restored, an undefined number of outdated messages will be transmitted from the TX queue.
While the CAN bus connection is still unavailable but the TX queue is already full, transmission of the most recent messages (i.e. the only valid ones) will be discarded.
Once the CAN bus connection is restored, there will be a short-term traffic burst while the TX queue is being flushed. This can disturb the time-triggered bus scheduling if one is used (it is in my case).
Question
My application uses SocketCAN driver, so basically the question should be applied to SocketCAN, but other options are considered too if there are any.
I see two possible solutions: define a message transmission timeout (if a message was not transmitted within some predefined amount of time, it is discarded automatically), or abort the transmission of outdated messages manually (though I doubt that is possible at all with the socket API).
Since the first option seems to be most real to me, the question is:
How does one define TX timeout for CAN interface under Linux?
Do other options exist to solve the problems described above, aside from TX timeouts?
My solution for this problem was shutting down and bringing the device up again:
void clear_device_queue(void)
{
    /* Flush the CAN TX queue by bouncing the interface.  Requires
     * <stdbool.h>, <stdio.h>, <stdlib.h>, <unistd.h> and a file-scope
     * "static bool queue_cleared;" flag. */
    if (!queue_cleared)
    {
        const char *dev = getenv("MOTOR_CAN_DEVICE");
        char cmd[1024];

        /* Taking the interface down discards everything queued for TX. */
        snprintf(cmd, sizeof(cmd), "sudo ip link set down %s", dev);
        system(cmd);

        usleep(500000);    /* give the driver time to settle */

        snprintf(cmd, sizeof(cmd), "sudo ip link set up %s", dev);
        system(cmd);

        queue_cleared = true;
    }
}
I don't know the internals of SocketCAN, but I think the larger part of the problem should be solved on a more general, logical level.
Before that, there is one aspect to clarify:
The question includes the tag safety-critical...
If the CAN communication is not used to implement a safety function, you can pick any solution you find useful. There may be parts of the second alternative which are useful for you in this case too, but those are not mandatory.
If the communication is, however, used in a safety-relevant context, there must be a concept that takes into account the requirements imposed by IEC 61508 (safety of programmable electronic systems in general) and IEC 61784-x/62280 (safe communication protocols).
Those standards usually lead to some protocol measures that come in handy with any embedded communication, but especially for the present problem:
Add a sequence counter to the protocol frames.
The receiver shall monitor that the counter values it sees don't make larger "jumps" than allowed (e.g., if you allow 2 frames to be lost along the way, the maximum counter increment may be +3; the CAN bus may also duplicate a frame, so a counter increment of +0 must be tolerated, too). See the receiver-side sketch after this list.
The receiver must monitor that every received frame is followed by another within a timeout period. If your CAN connection is lost and recovered in the meantime, whether this check trips depends on whether the interruption was longer than the timeout or within it.
Additionally, the receiver may monitor that a frame doesn't follow the preceding one too early, but if the frames include the right data, this usually isn't necessary.
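A receiver-side sketch of the counter and timeout checks above might look like this (plain C with a hypothetical frame layout; the concrete limits must come from your own timing and safety analysis):

#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define MAX_COUNTER_JUMP 3     /* tolerate up to 2 lost frames        */
#define RX_TIMEOUT_MS    100   /* maximum gap between valid frames    */

static uint8_t         last_counter;
static struct timespec last_rx_time;

/* Returns true if the frame is acceptable: the sequence counter did not
 * jump too far and the frame arrived within the timeout window. */
static bool check_frame(uint8_t counter, const struct timespec *now)
{
    uint8_t jump = (uint8_t)(counter - last_counter);  /* wraps mod 256 */
    long gap_ms = (now->tv_sec - last_rx_time.tv_sec) * 1000
                + (now->tv_nsec - last_rx_time.tv_nsec) / 1000000;

    /* +0 is tolerated because CAN may deliver a frame twice; anything
     * above MAX_COUNTER_JUMP means too many frames were lost. */
    bool ok = (jump <= MAX_COUNTER_JUMP) && (gap_ms <= RX_TIMEOUT_MS);

    if (jump != 0) {            /* duplicates don't advance the state */
        last_counter = counter;
        last_rx_time = *now;
    }
    return ok;
}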
[...] The nature of the data carried by this message is like measurement rather than command, i.e. only the most recent one is actually valid, and if some messages are lost that is not an issue as long as the latest one was received successfully.
Through CAN, you shall never communicate "commands" in the sense that every single one of them triggers a change, like "toggle output state" or "increment set value by one unit", because you never know whether frame duplication will hit you or not.
Besides, you shall never communicate "anything safety-relevant" through a single frame because any frame may be lost or broken by an error. Instead, "commands" shall be transferred (like measurements) as a stream of periodical frames with measurement or set value updates.
Now, in order to get the required availability out of the protocol design, the TX queue shouldn't be long. If you actually feel you need that queue, it could be that the bus is overloaded compared to the timing requirements it faces. From my point of view, the TX "queue" shouldn't be longer than one or two frames. Then the problem of recovering the CAN connection is nearly fixed...
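One way to approximate a one-frame TX queue with SocketCAN is to transmit from a non-blocking raw CAN socket and simply skip the send when the queue is full; the next cycle carries a fresher measurement anyway. A sketch, with error handling trimmed:

#include <errno.h>
#include <fcntl.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/can.h>
#include <linux/can/raw.h>

/* Open a non-blocking CAN_RAW socket bound to the given interface. */
static int open_can(const char *ifname)
{
    int s = socket(PF_CAN, SOCK_RAW, CAN_RAW);
    struct ifreq ifr;

    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ifr.ifr_name[IFNAMSIZ - 1] = '\0';
    ioctl(s, SIOCGIFINDEX, &ifr);

    struct sockaddr_can addr = { .can_family  = AF_CAN,
                                 .can_ifindex = ifr.ifr_ifindex };
    bind(s, (struct sockaddr *)&addr, sizeof(addr));

    fcntl(s, F_SETFL, fcntl(s, F_GETFL, 0) | O_NONBLOCK);
    return s;
}

/* Periodic transmit: if the queue is full, drop this (soon outdated)
 * measurement instead of letting it pile up behind stale frames. */
static void send_measurement(int s, const struct can_frame *frame)
{
    if (write(s, frame, sizeof(*frame)) < 0 &&
        (errno == ENOBUFS || errno == EAGAIN || errno == EWOULDBLOCK))
    {
        /* TX path congested: skip, the next cycle sends a newer value. */
    }
}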

TCP Message framing + recv() [linux]: Good conventions?

I am trying to create a p2p application on Linux, which I want to run as efficiently as possible.
The issue I have is with managing packets. As we know, there may be more than one packet in the recv() buffer at any time, so there is a need to have some kind of message framing system to make sure that multiple packets are not treated as one big packet.
So at the moment my packet structure is:
(u16int Packet Length):(Packet Data)
Which requires two calls to recv(); one to get the packet size, and one to get the packet.
There are two main problems with this:
1. A malicious peer could send a packet with a size header of something large, but not send any more data. The application will hang on the second recv(), waiting for data that will never come.
2. Assuming that calling recv() has a noticeable performance penalty (I actually have no idea, correct me if I am wrong), calling recv() twice will slow the program down.
What is the best way to structure packets and the receiving system for both efficiency and stability? How do other applications do it? What do you recommend?
Thank you in advance.
I think your "framing" of messages within a TCP stream is right on.
You could consider putting a "magic cookie" in front of each frame (e.g. write the 32-bit int "0xdeadbeef" at the top of each frame header in addition to the packet length) such that it becomes obvious that you are reading a frame header on the first of each recv() pair. If the magic integer isn't present at the start of the message, you have gotten out of sync and need to tear the connection down.
Multiple recv() calls will not likely be a performance hit. As a matter of fact, because TCP messages can get segmented, coalesced, and stalled in unpredictable ways, you'll likely need to call recv() in a loop until you get all the data you expected. This includes your two byte header as well as for the larger read of the payload bytes. It's entirely possible you call "recv" with a 2 byte buffer to read the "size" of the message, but only get 1 byte back. (Call recv again, and you'll get the subsequent bytes). What I tell the developers on my team - code your network parsers as if it was possible that recv only delivered 1 byte at a time.
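A sketch of that receive pattern, combining the magic-cookie-plus-length header suggested above with a loop that tolerates recv() returning as little as one byte (blocking socket, error handling abbreviated):

#include <arpa/inet.h>   /* ntohl, ntohs */
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>

#define FRAME_MAGIC 0xdeadbeef

/* Keep calling recv() until exactly len bytes have arrived (or the peer
 * closed the connection / an error occurred). */
static bool recv_all(int sock, void *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(sock, (char *)buf + got, len - got, 0);
        if (n <= 0)
            return false;          /* error or connection closed */
        got += (size_t)n;
    }
    return true;
}

/* Read one frame: 4-byte magic + 2-byte length header, then the payload. */
static bool recv_frame(int sock, char *payload, size_t max_len)
{
    uint32_t magic;
    uint16_t len;

    if (!recv_all(sock, &magic, sizeof(magic)) ||
        !recv_all(sock, &len, sizeof(len)))
        return false;
    if (ntohl(magic) != FRAME_MAGIC)
        return false;              /* out of sync: tear the connection down */
    len = ntohs(len);
    if (len > max_len)
        return false;
    return recv_all(sock, payload, len);
}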
You can use non-blocking sockets and the "select" call to avoid hanging. If the data doesn't arrive within a reasonable amount of time (or more data arrives than expected - such that syncing on the next message becomes impossible), you just tear the connection down.
I'm working on a P2P project of my own. Would love to trade notes. Follow up with me offline if you like.
I disagree with the others: TCP is a reliable protocol, so a packet magic header is useless unless you fear that your client code isn't stable or that unsolicited clients will connect to your port number.
Create a buffer for each client and use non-blocking sockets and select/poll/epoll/kqueue. If there is data available from a client, read as much as you can, it doesn't matter if you read more "packets". Then check whether you've read enough so the size field is available, if so, check that you've read the whole packet (or more). If so, process the packet. Then if there's more data, you can repeat this procedure. If there is partial packet left, you can move that to the start of your buffer, or use a circular buffer so you don't have to do those memmove-s.
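A sketch of the per-client buffer parsing described above, assuming the question's 2-byte length prefix (taken here to be in network byte order) and a hypothetical handler callback:

#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 65536

struct client {
    char   buf[BUF_SIZE];   /* accumulation buffer for this client */
    size_t used;            /* bytes currently buffered            */
};

/* Call after every read that appended data to c->buf.  Extracts as many
 * complete (length, payload) packets as are available, then moves any
 * partial packet back to the start of the buffer. */
static void process_buffer(struct client *c,
                           void (*handle)(const char *data, uint16_t len))
{
    size_t off = 0;

    while (c->used - off >= 2) {                  /* length field complete? */
        uint16_t len;
        memcpy(&len, c->buf + off, 2);
        len = ntohs(len);
        if (c->used - off - 2 < len)              /* payload not complete yet */
            break;
        handle(c->buf + off + 2, len);
        off += 2 + (size_t)len;
    }
    memmove(c->buf, c->buf + off, c->used - off); /* keep the partial tail */
    c->used -= off;
}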
Client timeout can be handled in your select/... loop.
That's what I would use if you're doing something complex with the received packet data. If all you do is write the results to a file (in bigger chunks), then sendfile/splice yields better performance. Just read the packet length (this could take multiple reads), then use multiple calls to sendfile until you've read the whole packet (keep track of how much is left to read).
You can make recv() non-blocking (by setting SOCK_NONBLOCK on the socket) and wait for the socket to become ready for reading using select() (with a timeout) in a loop.
Then if a file descriptor is in the "waiting for data" state for too long, you can just close the socket.
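A minimal sketch of that timeout check with select() (one socket shown; a real loop would track a deadline per connection):

#include <stdbool.h>
#include <sys/select.h>

/* Wait up to timeout_ms for the socket to become readable.
 * Returns true if data is ready, false on timeout or error. */
static bool wait_readable(int sock, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv = { .tv_sec  = timeout_ms / 1000,
                          .tv_usec = (timeout_ms % 1000) * 1000 };

    FD_ZERO(&readfds);
    FD_SET(sock, &readfds);
    return select(sock + 1, &readfds, NULL, NULL, &tv) > 0;
}

/* e.g.: if (!wait_readable(sock, 5000)) close(sock); */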
TCP is a stream-oriented protocol - it doesn't actually have any concept of packets. So, in addition to receiving multiple application-layer packets in one recv() call, you might also receive only part of an application-layer packet, with the remainder coming in a future recv() call.
This implies that robust receiver behaviour is obtained by receiving as much data as possible at each recv() call, then buffering that data in an application-layer buffer until you have at least one full application-layer packet. This also avoids your two-calls-to-recv() problem.
To always receive as much data as possible at each recv(), without blocking, you should use non-blocking sockets and call recv() until it returns -1 with errno set to EWOULDBLOCK.
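That drain loop, as a sketch (assuming the socket has already been switched to non-blocking mode):

#include <errno.h>
#include <sys/socket.h>

/* Read everything currently available into the application-layer buffer.
 * Returns the number of bytes appended, or -1 if the connection is gone. */
static ssize_t drain_socket(int sock, char *buf, size_t cap, size_t used)
{
    size_t start = used;

    for (;;) {
        ssize_t n = recv(sock, buf + used, cap - used, 0);
        if (n > 0) {
            used += (size_t)n;
            if (used == cap)
                break;               /* buffer full: parse before reading more */
        } else if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)) {
            break;                   /* nothing more available right now */
        } else {
            return -1;               /* error or orderly shutdown */
        }
    }
    return (ssize_t)(used - start);
}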
As others said, a leading magic number (OT: man file) is a good (99.999999%) solution for identifying datagram boundaries, and a timeout (using non-blocking recv()) is good for detecting missing/late packets.
If you are worried about attackers, you should put a CRC in your packet. If a professional attacker really wants to, he/she will figure out sooner or later how your CRC works, but it is still harder than forging a packet without a CRC. (Also, if security is critical, you will find SSL libraries/examples/code on the Net.)

realtime midi input and synchronisation with audio

I have built a standalone app version of a project that until now was just a VST/audiounit. I am providing audio support via rtaudio.
I would like to add MIDI support using rtmidi but it's not clear to me how to synchronise the audio and MIDI parts.
In VST/audiounit land, I am used to MIDI events that have a timestamp indicating their offset in samples from the start of the audio block.
rtmidi provides a delta time in seconds since the previous event, but I am not sure how I should grab those events and how I can work out their time in relation to the current sample in the audio thread.
How do plugin hosts do this?
I can understand how events can be sample accurate on playback, but it's not clear how they could be sample accurate when using realtime input.
rtaudio gives me a callback function. I will run at a low block size (32 samples). I guess I will pass a pointer to an rtmidi instance as the userdata part of the callback and then call midiin->getMessage( &message ); inside the audio callback, but I am not sure whether this is sensible from a threading point of view.
Many thanks for any tips you can give me
In your case, you don't need to worry about it. Your program should send the MIDI events to the plugin with a timestamp of zero as soon as they arrive. I think you have perhaps misunderstood the idea behind what it means to be "sample accurate".
As @Brad noted in his comment to your question, MIDI is indeed very slow. But that's only part of the problem... when you are working in a block-based environment, incoming MIDI events cannot be processed by the plugin until the start of a block. When computers were slower and block sizes of 512 (or god forbid, >1024) were common, this introduced a non-trivial amount of latency which resulted in the arrangement not sounding as "tight". Therefore sequencers came up with a clever way to get around this problem. Since the MIDI events are already known ahead of time, these events can be sent to the instrument one block early with an offset in sample frames. The plugin then receives these events at the start of the block and knows not to start actually processing them until N samples have passed. This is what "sample accurate" means in sequencers.
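As an illustration of that mechanism, a host scheduling a pre-sequenced event into the upcoming block might compute the sample-frame offset roughly like this (a hypothetical sketch, not how any particular host implements it):

#include <stdint.h>

/* Convert an event's absolute time (in seconds on the audio clock) into a
 * sample-frame offset inside the upcoming block.  Events timestamped "now
 * or earlier" (e.g. live input) simply get offset 0. */
static uint32_t event_frame_offset(double event_time_sec,
                                   double block_start_sec,
                                   double sample_rate,
                                   uint32_t block_size)
{
    double offset = (event_time_sec - block_start_sec) * sample_rate;

    if (offset <= 0.0)
        return 0;                      /* live/late events: play immediately */
    if (offset >= (double)block_size)
        return block_size - 1;         /* belongs to a later block; clamped
                                          here only to keep the sketch short */
    return (uint32_t)offset;
}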
However, if you are dealing with live input from a keyboard or some sort of other MIDI device, there is no way to "schedule" these events. In fact, by the time you receive them, the clock is already ticking! Therefore these events should just be sent to the plugin at the start of the very next block with an offset of 0. Sequencers such as Ableton Live, which allow a plugin to simultaneously receive both pre-sequenced and live events, simply send any live events with an offset of 0 frames.
Since you are using a very small block size, the worst-case scenario is a latency of 0.7 ms, which isn't too bad at all. In the case of rtmidi, the timestamp does not represent an offset which you need to schedule around, but rather the time at which the event was captured. But since you only intend to receive live events (you aren't writing a sequencer, are you?), you can simply pass any incoming MIDI to the plugin right away.

Resources