Silkopter uses two sockets to communicate with the ground station: a reliable TCP socket for the remote control, diagnostics, telemetry etc., and an unreliable UDP socket for the video stream. Both go through an Alfa AWUS036H wifi card.
The UAV brain splits video frames into 1 KB packets in order to minimize their chances of corruption – and marks each of them with the frame index and timestamp. The ground station receives these packets and tries to put them in order. It waits up to 100ms for all the packets of a frame to arrive before either presenting the frame to the user or discarding it. So it allows some lag to improve video quality, but above 100ms it favors real-time-ness over quality.
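A minimal sketch of what this splitting could look like – note that the header layout and names here are my own illustration, not the actual Silkopter wire format:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical per-packet header: frame index + timestamp let the ground
// station group packets by frame and decide when the 100ms window expired.
struct PacketHeader {
    uint32_t frame_index;   // which video frame this packet belongs to
    uint32_t timestamp_ms;  // capture time of the frame
    uint16_t packet_index;  // position of this packet within the frame
    uint16_t packet_count;  // total number of packets for this frame
};

constexpr size_t MAX_PAYLOAD = 1024; // 1 KB of payload per packet

// Split one encoded frame into MAX_PAYLOAD-sized packets, each prefixed
// with the header above.
std::vector<std::vector<uint8_t>> split_frame(const std::vector<uint8_t>& frame,
                                              uint32_t frame_index,
                                              uint32_t timestamp_ms) {
    size_t count = (frame.size() + MAX_PAYLOAD - 1) / MAX_PAYLOAD;
    std::vector<std::vector<uint8_t>> packets;
    for (size_t i = 0; i < count; i++) {
        size_t offset = i * MAX_PAYLOAD;
        size_t len = std::min(MAX_PAYLOAD, frame.size() - offset);
        PacketHeader h{frame_index, timestamp_ms,
                       static_cast<uint16_t>(i), static_cast<uint16_t>(count)};
        std::vector<uint8_t> p(sizeof(h) + len);
        std::memcpy(p.data(), &h, sizeof(h));
        std::memcpy(p.data() + sizeof(h), frame.data() + offset, len);
        packets.push_back(std::move(p));
    }
    return packets;
}
```

The `packet_count` field lets the receiver know immediately how many packets to wait for, so it can present the frame as soon as the set is complete instead of always burning the full 100ms.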
All the remote control, telemetry and sensor data is sent through TCP, and in theory, given a good enough wifi link, this should be ideal. But it's not, not even close. I consistently get 300-2000ms (yes, up to 2 full seconds) of lag over this channel.
All of this is due to how TCP handles network congestion. When lots of packets are lost, TCP assumes the network is under a lot of pressure and throttles down its data rate in an attempt to prevent even more packet loss. This is precisely what is causing my lag – the TCP traffic hits the heavy, raw UDP video traffic and thinks the network is congested, so it slows down a lot. It doesn't realize that I care more about the TCP traffic than the UDP one, so I end up having smooth video but zero control.
My solution is to create a new reliable protocol over UDP and send the control, telemetry and sensor data over this channel, in parallel with the video traffic. In low bandwidth situations I can then favor the critical control data over the video.
There are lots of reliable UDP libraries out there, but for simple enough problems I've always preferred writing my own when there's the potential of getting something better suited to my needs (not to mention the learning experience).
So my design is this:
- I will have a transport layer that can send messages – binary blobs – and that exposes some channels where received messages can be accessed.
- Data is split into messages, and each message has the following properties:
  - MTU. Big messages can be split into smaller packets. If a packet is lost, only that packet is resent if needed. This limits the amount of data to resend.
  - Priority. Messages with higher priority are sent before those with lower priority.
  - Delivery control. If enabled, these messages trigger a delivery confirmation from the receiver. If the confirmation is not received within X seconds, the message is resent. The confirmations also have a priority – just like any other message.
  - Waiting time. If a low priority message has waited too long, its priority gets bumped. For video, I'll use a 500ms waiting time to make sure I get some video even in critical situations.
  - Channels. Each message will have a channel number so that the receiving end can configure some options per channel. Video will have one channel index, telemetry another and so on. The GS will be able to configure a 100ms waiting period for out-of-order messages on the video channel, for example.
My current implementation uses boost::asio and I intend to keep using it.
As soon as I finish the stability PIDs I'll move to the new protocol.