Intro to DTLS

2016-09-24 21:25 by Ian

Intended audience

This post is aimed at technical readers who know what TLS is used for, but may know nothing about its operation. It is also an attempt to explain why DTLS was developed, and how it applies to IoT.

Commonly mis-apprehended words

"Transport" is just a channel to move bytes. It need not involve two different programs, but does imply a sender and a receiver.

"Security" does not imply "protection against malice". And many security problems involving malice are orthogonal concerns. Consider these security devices...

Device: Security against....
Padlock: Theft
Tamper Seal: Modification
Ratchet Strap: Momentum and/or gravity
Autograph: Impersonation
This thing: Eavesdropping

"Server" is the passive party in the conversation. This does not imply accessibility by more than one client. It also doesn't speak to its size or bandwidth-capabilities. It doesn't even imply that it has information for a client to access. All of that is noise, and until dispelled, will complicate our reasoning needlessly.

The only thing that "server" implies is a program that is alert while idle, waiting for the outside world to impinge upon it.

Conversely, "Client" is the party that initiates the connection. There exist programs that are simultaneously client and server. We will be dealing with many such programs.

What and why?

DTLS ("Datagram Transport Layer Security") is a means of securing a datagram-oriented transport. This is useful in situations where your available transport does not provide all of the following things:

On the internet, this means UDP. But it might equally well apply to packet radio (like BLE or APRS), SMS text messages, or QR codes printed and posted in public places.

These transports are all connectionless.
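To make "connectionless" concrete, here is a minimal Python sketch of a UDP exchange over loopback. The function name `udp_round_trip` is made up for illustration. Note that neither side ever calls `connect()` or `accept()`: the "server" learns who it is talking to only from the return address attached to each datagram.

```python
import socket


def udp_round_trip(payload):
    """Send one datagram over loopback and echo it back, with no connection setup."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    server.settimeout(2.0)

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    try:
        # No handshake: the client simply addresses a datagram to the server.
        client.sendto(payload, server.getsockname())

        # The sender's address arrives alongside every datagram.
        received, sender = server.recvfrom(2048)
        server.sendto(received.upper(), sender)

        reply, _ = client.recvfrom(2048)
        return received, reply
    finally:
        client.close()
        server.close()
```

On a real network, any of those datagrams could be dropped, duplicated, or delivered out of order, and nothing at this layer would notice.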

Counter-examples (TCP-like transports) would be websockets (built on TCP), character streams from serial ports, POSIX pipes, or file I/O as most of us understand it.

(D)TLS is a protocol. And all protocols must respect the constraints of the transports over which they operate.

TLS (with no "D") expects that the bytes it is fed from the outside world are in-order with no gaps or additions, and that they are coming from only one source (it needs a connection). If those criteria are not met, TLS will fail to operate. As it turns out, these standards are far too high for many communication channels to meet.

DTLS adds bookkeeping that TCP normally provides but UDP does not (explicit sequence numbers on every record, plus retransmission timers and reassembly for handshake messages), along with the machinery to manage it, for the sake of constructing an artifice of "session" on top of a transport that does not natively support this concept, and/or isn't reliable.
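As a rough illustration of where those sequence numbers live, here is a Python sketch that packs and parses the 13-byte DTLS 1.2 record header: content type, version, epoch, a 48-bit sequence number, and length. The class and function names are my own, not taken from any particular library.

```python
import struct
from typing import NamedTuple

DTLS_1_2 = 0xFEFD          # wire encoding of the DTLS 1.2 version field
HEADER_LEN = 13            # 1 + 2 + 2 + 6 + 2 bytes


class DtlsRecordHeader(NamedTuple):
    content_type: int      # e.g. 22 = handshake, 23 = application data
    version: int
    epoch: int             # bumped each time the cipher state changes
    sequence_number: int   # 48-bit counter, explicit in every record
    length: int            # length of the record body that follows


def build_record_header(content_type, epoch, sequence_number, length,
                        version=DTLS_1_2):
    # The 48-bit sequence number is packed as a 16-bit high part
    # followed by a 32-bit low part, all in network byte order.
    return struct.pack("!BHHHIH", content_type, version, epoch,
                       sequence_number >> 32,
                       sequence_number & 0xFFFFFFFF, length)


def parse_record_header(datagram):
    if len(datagram) < HEADER_LEN:
        raise ValueError("datagram too short for a DTLS record header")
    content_type, version, epoch, seq_hi, seq_lo, length = struct.unpack(
        "!BHHHIH", datagram[:HEADER_LEN])
    return DtlsRecordHeader(content_type, version, epoch,
                            (seq_hi << 32) | seq_lo, length)
```

Because the sequence number travels with each record, a receiver can recognise a replayed or reordered record without any help from the transport underneath.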

Ok.... so why would you want to use UDP in the first place if you are going to rebuild TCP on top of it anyway?

Valid question. There are a few compelling reasons:

  1. The notion of "multicast" doesn't make sense within a "connection". So you would need an out-of-band means of doing discovery.

  2. Keeping a TCP session alive means a constant resting memory load in your transport driver, and probably network activity to prevent timeouts. While you incur these costs in DTLS anyway, at least they are confined to the presentation layer, and therefore under more direct application control. In terms of compute: TCP has higher fixed-costs at runtime.

  3. TCP represents a protocol layer that maintains a concept of "logical" connection for which there exists no physical or cryptographic basis. Because it costs non-zero compute and bandwidth to maintain this illusion of connection, it is itself a target for a unique set of exploits (e.g. SYN flooding). DTLS allows the application layer to make choices about which clients are worth serving.
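One concrete form of "choosing which clients are worth serving" is the DTLS cookie exchange: before committing any per-client state, the server can reply with a cookie and wait for the client to echo it back. The Python sketch below derives such a cookie statelessly with an HMAC over the client's address and hello bytes, which is along the lines of what the DTLS specification suggests. The function names and the exact input layout are assumptions for illustration only.

```python
import hashlib
import hmac
import os

# Secret known only to the server. A real server would rotate it periodically.
SERVER_SECRET = os.urandom(32)


def make_cookie(client_addr, client_hello, secret=SERVER_SECRET):
    """Derive a cookie from the client's address and hello; nothing is stored."""
    ip, port = client_addr
    material = "{}:{}".format(ip, port).encode() + b"|" + client_hello
    return hmac.new(secret, material, hashlib.sha256).digest()


def cookie_is_valid(cookie, client_addr, client_hello, secret=SERVER_SECRET):
    """Recompute the cookie and compare in constant time."""
    expected = make_cookie(client_addr, client_hello, secret)
    return hmac.compare_digest(cookie, expected)
```

A client that spoofs its source address never sees the cookie, so it cannot complete the second round trip, and the server has spent only one hash on it.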

To sum it up, UDP and TCP are not simply "two ways to do the same thing". Beyond moving bytes over IP, they don't do the same things. But since we want to have security no matter what sort of transport we have, we use DTLS when TLS is not feasible.

What is a 'cipher suite'

Cryptography can be used for a number of unrelated tasks. Here, we are concerned with three separate security issues:

A cipher suite is a way of specifying which algorithms we want to use for which purposes.

Generally, there are two basic classes of cryptographic algorithm: symmetric and asymmetric. See this post for deeper clarification. But it is sufficient for our purposes here to know that asymmetric algorithms are typically used to establish authenticity and identity, because the asymmetry of public and private keys creates a natural one-way relationship between signature and identity. But the heavy math that makes them hard to reverse also makes them horrendously inefficient for concealing bulk data.

By contrast, symmetric algorithms are difficult to use for identity, because the key relationship is such that anyone who can understand you can also pretend to be you. But they are typically the most efficient means of reversibly scrambling bulk data.
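The identity problem with symmetric keys is easy to see with a keyed hash. In this Python sketch (the key, messages and helper names are invented for the example), a valid tag proves only that the sender holds the shared key, not which key holder sent it.

```python
import hashlib
import hmac


def tag(shared_key, message):
    """Authenticate a message with a key that both parties hold."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()


def verify(shared_key, message, received_tag):
    return hmac.compare_digest(tag(shared_key, message), received_tag)


key = b"pre-shared key known to Alice and Bob"

# Alice sends a tagged message, and Bob can check it.
from_alice = tag(key, b"open the valve")

# But Bob holds the same key, so he can mint a tag that is
# indistinguishable from one Alice would have produced.
forged_by_bob = tag(key, b"close the valve")
```

Both tags verify under the shared key, so a third party shown either message could not tell who wrote it.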

TLS cipher suites allow us to distill combinations and arrangements of these algorithms into a single 16-bit number. By negotiating about that number at the time the connection is established, server and client can achieve secure communication (whatever that means to us) within the bounds of the system constraints and security tolerances of the participating systems. If there is overlap in their cipher suite support, they can talk.
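The overlap test can be sketched in a few lines of Python. The codes below are IANA-assigned values for suites mentioned in this post plus two common web suites. The selection policy (first match in the client's preference order) is a simplifying assumption, since a real server is free to apply its own preference.

```python
# A handful of IANA-assigned 16-bit cipher suite codes.
CIPHER_SUITES = {
    0xC0A8: "TLS_PSK_WITH_AES_128_CCM_8",
    0xC018: "TLS_ECDH_anon_WITH_AES_128_CBC_SHA",
    0xC02F: "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    0x002F: "TLS_RSA_WITH_AES_128_CBC_SHA",
}


def negotiate(client_offer, server_supported):
    """Return the first offered code the server also supports, or None."""
    for code in client_offer:
        if code in server_supported:
            return code
    return None


constrained_device = [0xC0A8, 0xC018]    # client's offer, in preference order
iot_gateway = {0xC0A8, 0xC02F}           # one server's supported set
web_server = {0xC02F, 0x002F}            # another server's supported set

chosen = negotiate(constrained_device, iot_gateway)
no_overlap = negotiate(constrained_device, web_server)
```

Here the device and the gateway settle on `TLS_PSK_WITH_AES_128_CCM_8`, while the device and the web server have nothing in common and cannot talk.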

The IANA defines these cipher suite mappings, and maintains a list to facilitate interoperability of TLS implementations. We will come back to this later...

Reading cipher suite identifiers

The 16-bit cipher suite code is usually apprehended by a person as a string formatted as a C-style constant. Great care has been taken to make this string as descriptive and uniform as possible.


...can be read as...


Key exchange should be done with RSA Diffie-Hellman, identity is supported by an RSA signature
AES256 as the bulk encryption algorithm, using GCM block mode
Message payloads are digested by SHA1 before being signed.


...can be read as...

Key exchange should be done with Elliptic curve Diffie-Hellman (Ephemeral), identity is supported by an ECDSA signature
no encryption
Message payloads are digested by SHA384 before being signed.

A "null cipher" is a formal way of saying: "Nothing happens". No encryption.

A conversation using this cipher suite would be plainly-understandable to anyone listening. But injecting messages or altering their content would be a god-like act, considering how strongly the messages are signed.
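Because the naming is so uniform, the same reading can be done mechanically. The Python sketch below is a heuristic of my own, not an official grammar, and splits a suite name into its parts. It is shown on identifiers that appear elsewhere in this post, plus one IANA-registered null-cipher suite.

```python
def describe_cipher_suite(name):
    """Split a TLS_<kx>[_<auth>]_WITH_<cipher>[_<digest>] name into its parts."""
    if not name.startswith("TLS_") or "_WITH_" not in name:
        raise ValueError("not a recognisable cipher suite name: " + name)

    handshake, bulk = name[len("TLS_"):].split("_WITH_", 1)

    # "ECDHE_ECDSA" -> key exchange + authentication; "PSK" covers both roles.
    key_exchange, _, authentication = handshake.partition("_")
    authentication = authentication or key_exchange

    # A trailing SHA*/MD5 token names the digest. Suites without one
    # (such as CCM_8) rely on the cipher mode itself for message integrity.
    tokens = bulk.split("_")
    digest = None
    if tokens[-1].startswith("SHA") or tokens[-1] == "MD5":
        digest = tokens.pop()

    return {
        "key_exchange": key_exchange,
        "authentication": authentication,
        "cipher": "_".join(tokens),
        "digest": digest,
    }
```

For example, `describe_cipher_suite("TLS_ECDHE_ECDSA_WITH_NULL_SHA")` reports a cipher of `NULL`: signed, but readable by anyone.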

Cipher suite back alleys

If you are somehow involved in IoT or embedded security, the set of TLS cipher suites you are concerned with might be disjoint from those on a web server.

With respect to TLS, the web is fairly homogeneous. Everyone's TLS-enabled web server has essentially the same constraints, the same goals, the same implicit network topology, and the same threat model as other web servers. It doesn't authenticate clients on connection, but is scrupulously careful about identifying itself completely.

Ok... maybe there is a load-balancer in the mix, or a fresh CVE that occasionally prods you to closely examine your TLS cipher suite support. But there exist TLS cipher suites that are indispensable in mesh or local networks, and have little-to-no use on the internet.

Did you know that TLS allows the server to demand the client identify themselves? TLS may be the only bulwark preventing the world's 34-year-old email systems from collapsing into a gibbering viagra infomercial. Those of you nodding your heads knowingly are in a good position to explain these issues to everyone else. Please: I need your help.

Asymmetric pain

Consider this family of cipher suites (and this one in particular): TLS_PSK_WITH_AES_128_CCM_8. This family is not specific to DTLS, but DTLS is their most-likely use case.

I've not once seen a web site that demanded I have a pre-shared symmetric key to merely connect, but that is what this cipher suite implies. Pre-shared symmetric keys are not common on the web. But they are common practice in local networks, where a pre-shared key (as you would have for your WiFi AP) is the most direct means of securing a device.

It may also be that the complexity of asymmetric operations is a rough pill to swallow on your platform. These operations tend to become more vulnerable to side-channel attacks as the time required to execute them legitimately grows. Notice that the cipher suite TLS_PSK_WITH_AES_128_CCM_8 has no asymmetric aspect.

A lesser-known fact about symmetric algorithms: some symmetric block modes (AES GCM/CCM) authenticate as well as encrypt/decrypt. While this doesn't address the Identity question, it lets one symmetric primitive cover both the secrecy and the integrity of every post-handshake message, and with a PSK cipher suite it potentially allows us to remove an expensive step from our TLS exchanges: asymmetric signatures during the handshake.

If you are using DTLS simply because the protocol you are seeking to protect is connectionless (SIP? DNS?), then you may not care about shaving 3ms off your response time by avoiding signing operations. But if you are trying to cram TLS sessions into a place where 16KiB RAM is considered lavish, you may have no other sane option.

Can you fog a mirror? Good enough.

Here is another TLS cipher suite that you aren't likely to see in your browser: TLS_ECDH_anon_WITH_AES_128_CBC_SHA

Anonymous cipher suites make no externally-referenced claim to identity. There is thus no way to prevent a man-in-the-middle, despite the use of encryption to conceal payloads, and signatures to validate them. The primary value in this cipher suite is to ensure that the system you are hearing from now was the same one you started the conversation with (not typically a concern on TCP/IP), and that the two of you aren't being overheard. But because you can't verify the identity of your counterparty, it could be anyone.
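The man-in-the-middle problem can be shown with a toy Diffie-Hellman exchange in Python. This is a sketch under loud assumptions: it uses classic finite-field DH rather than the elliptic-curve variant in the cipher suite, and the modulus is far too small for real use. The party names are the usual placeholders.

```python
import secrets

# Toy parameters for illustration only: far too small to be secure.
P = 2**61 - 1   # a Mersenne prime
G = 3


def keypair():
    private = secrets.randbelow(P - 3) + 2      # private value in [2, P - 2]
    return private, pow(G, private, P)          # (private, public)


def shared_secret(own_private, peer_public):
    return pow(peer_public, own_private, P)


alice_priv, alice_pub = keypair()
bob_priv, bob_pub = keypair()
mallory_priv, mallory_pub = keypair()

# Mallory sits on the wire and swaps each public value for her own.
# With no identity attached to the keys, neither victim can tell.
alice_secret = shared_secret(alice_priv, mallory_pub)
bob_secret = shared_secret(bob_priv, mallory_pub)

# Mallory now shares one secret with Alice and another with Bob,
# so she can decrypt, read, and re-encrypt everything in between.
mallory_with_alice = shared_secret(mallory_priv, alice_pub)
mallory_with_bob = shared_secret(mallory_priv, bob_pub)
```

Both legs are properly encrypted and authenticated; they are just not terminated where Alice and Bob think they are.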

First to feed the lost puppy wins! The prize? A puppy.

This cipher suite is SO uncommon, that it has been the focal-point of the past four months of my interoperability concerns. This will be detailed in a separate post.


Remember when I said we'd come back around to IANA? IANA provides an 8-bit dead-zone within the 16-bit cipher suite identifier space. It calls this dead-zone "private use". Private Use means any implementation can declare custom cipher suites and try new ideas without the risk of running afoul of some other implementation. The OIC1.1 draft's "just works" on-boarding workflow specifies the use of one of these two non-IANA TLS cipher suites:

0xFF00: TLS_ECDH_ANON_WITH_AES_128_CBC_SHA256(0xC018)  <--IANA assigned after OIC1.1 draft.
0xFF01: TLS_ECDH_ANON_WITH_AES_256_CBC_SHA256(0xC019)  <--IANA assigned after OIC1.1 draft.
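Checking whether a code falls in the private-use block is a one-line test on its high byte. A small Python sketch, with a function name of my own choosing:

```python
def is_private_use(cipher_suite_code):
    """True if the 16-bit code has 0xFF as its first byte (0xFF00..0xFFFF)."""
    if not 0 <= cipher_suite_code <= 0xFFFF:
        raise ValueError("cipher suite codes are 16-bit values")
    return (cipher_suite_code >> 8) == 0xFF
```

Both 0xFF00 and 0xFF01 pass this test, while 0xC018 and 0xC019 do not. A stack that only knows the registered codes has no way to guess what a peer means by a private-use value.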

IoTivity and its sister project, iotivity-constrained, use a DTLS library (TinyDTLS) that is cipher-suite-restricted. TLS_PSK_WITH_AES_128_CCM_8 is one of the few supported options. Right-of-way is commonly given to the system that has least flexibility, and this is no exception. From the iotivity-constrained README in their TinyDTLS fork...


At the time of this writing, and to my knowledge, only TinyDTLS has implemented this cipher suite in this way, and only the AES128 variant. To contrast, mbedTLS has these cipher suites in overlap...


Therefore, there is presently no way for an mbedTLS stack to execute the OIC1.1 onboarding workflow.


I will leave you with that plate to digest while I do the write-up for my solution-at-present.