Fun with the TLS handshake
This article is about the TLS handshake in general, with a focus on how cryptographic signing is used in the protocol. Specifically, the article will attempt to do the following:
- Explain how the TLS handshake works, using Python code.
- Highlight how some parts of the protocol allow a short amount of client-supplied data to be signed by the server certificate.
- ... and abuse the above parts of the protocol, and common implementations of the protocol, to build poor man's trusted time-stamping.
See the toytls page at github for the code.
An overview of the TLS handshake
TLS, and its predecessor SSL, is a protocol for secure (encrypted, authenticated) communication over an untrusted network. The "TLS handshake" is the initial part of the protocol, where a client and a server negotiate how (if) to accomplish the goal of secure communication.
The handshake consists of multiple messages being sent back and forth between the server and client, with the ultimate goal of: a) allowing the client to verify the authenticity of the server, usually using cryptographic certificates, b) allowing the server to verify the client and c) securely agreeing on a method to proceed with the bulk transfer of data.
The handshake typically looks like this:
>>> ClientHello
<<< ServerHello
<<< Certificates
<<< ServerHelloDone
>>> ClientKeyExchange
>>> ChangeCipherSpec
>>> Finished
<<< ChangeCipherSpec
<<< Finished
This means that the client first sends a message ClientHello to the server, and will get the response ServerHello back, etc. The handshake ends with both client and server sending Finished. After that point, a secure channel has been set up for the transfer of data.
Cryptographic primitives and cipher suites
TLS has the concept of a "cipher suite". This is a predefined list of cryptographic primitives in different classes (encryption, hashing, authentication etc) to use in the handshake and communication. In this article, we will concern ourselves with three cipher suites:
TLS_RSA_WITH_RC4_128_SHA = 0x0005
TLS_DHE_RSA_WITH_AES_128_CBC_SHA = 0x0033
TLS_ECDHE_RSA_WITH_RC4_128_SHA = 0xc011
The first one, suite TLS_RSA_WITH_RC4_128_SHA (which has id 5), is implemented and supported by almost all clients and servers. The last two are not as common (the servers at Google and Facebook support one each, but not both) and are selected specifically because they have interesting signing behavior, which will be discussed further below.
In English, the three cipher suites above mean:
- A cipher suite using RSA for secure key exchange, RC4 for bulk data encryption and SHA1 as a hash.
- A cipher suite using Diffie-Hellman for secure key exchange, RSA signatures for authentication, AES-128 in CBC mode for bulk encryption and SHA1 as hash.
- A cipher suite using Elliptic Curve Diffie-Hellman for secure key exchange, RSA signatures for authentication, RC4 for bulk data encryption and SHA1 as hash.
We will concern ourselves with the second suite to implement the "clever hacks" mentioned in the very beginning of this article.
To implement a handshake using TLS_RSA_WITH_RC4_128_SHA, you need, of course, an implementation of RSA, RC4 and SHA1. In addition, TLS version 1.1, which is by far the most common version on the Internet today, requires an implementation of MD5, which is used for authenticating the handshake itself. This is not a requirement for TLS version 1.2, which uses SHA-256 instead.
For practical use of RSA, the cipher suites mentioned employ the PKCS #1 v1.5 padding scheme, which is a method for padding a byte string securely before encrypting it.
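As a rough illustration of what that padding looks like, here is a minimal sketch of PKCS #1 v1.5 encryption padding followed by the raw RSA operation. The function names are made up for this example and are not taken from toytls/rsa.py; the public key (n, e) is assumed to have been extracted from the server certificate elsewhere.

```python
import os


def pkcs1_v15_pad_for_encryption(message: bytes, modulus_len: int) -> bytes:
    """Return 00 02 <nonzero random bytes> 00 <message>, modulus_len bytes long."""
    pad_len = modulus_len - len(message) - 3
    if pad_len < 8:
        raise ValueError("message too long for this modulus size")
    padding = bytearray()
    while len(padding) < pad_len:
        # Padding bytes must be nonzero, so drop any zero bytes we draw.
        padding += bytes(b for b in os.urandom(pad_len - len(padding)) if b != 0)
    return b"\x00\x02" + bytes(padding) + b"\x00" + message


def rsa_encrypt(message: bytes, n: int, e: int) -> bytes:
    """Pad the message and apply the RSA public key operation m^e mod n."""
    k = (n.bit_length() + 7) // 8  # modulus size in bytes
    padded = pkcs1_v15_pad_for_encryption(message, k)
    return pow(int.from_bytes(padded, "big"), e, n).to_bytes(k, "big")
```

The single zero byte after the random padding is what lets the receiver find where the message starts, which is why the padding itself must not contain zeros.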
In toytls, the cryptographic primitives are implemented in toytls/rsa.py and toytls/rc4.py.
Message structure
To get a general feel for the TLS handshake we will dissect a sample ClientHello message, being sent from the client to the server. A simple ClientHello can look like below, with line breaks and comments added for readability.
16 03 02 00 31 # TLS Header
01 00 00 2d # Handshake header
03 02 # ClientHello field: version number (TLS 1.1)
50 0b af bb b7 5a b8 3e f0 ab 9a e3 f3 9c 63 15 \
33 41 37 ac fd 6c 18 1a 24 60 dc 49 67 c2 fd 96 # ClientHello field: random
00 # ClientHello field: session id
00 04 # ClientHello field: cipher suite length
00 33 c0 11 # ClientHello field: cipher suite(s)
01 # ClientHello field: compression support, length
00 # ClientHello field: compression support, no compression (0)
00 00 # ClientHello field: extension length (0)
All TLS messages have the TLS header, which consists of a content type (first byte), a version number (next two bytes) and a length (last two bytes). The content type describes the type of the message, and is 0x16 for handshake messages and 0x17 for encrypted application data.
The handshake messages have a second handshake header describing the type of the handshake message (ClientHello, ServerHello etc). The handshake header also has a length field.
Following the handshake header is payload data specific to the handshake message. For ClientHello, this field contains some random data used for encryption purposes, a list of cipher suites supported by the client, information about compression support and optionally support for TLS extensions, which will not be covered in this article.
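To make the layout concrete, the sample ClientHello above can be taken apart with a few lines of Python. This is only a sketch for this one message: the function name and the returned dictionary keys are invented for the example, and extensions are not parsed.

```python
import struct

# The sample ClientHello from above, as one byte string.
CLIENT_HELLO = bytes.fromhex(
    "1603020031"                        # TLS header
    "0100002d"                          # handshake header
    "0302"                              # version (TLS 1.1)
    "500bafbbb75ab83ef0ab9ae3f39c6315"
    "334137acfd6c181a2460dc4967c2fd96"  # random
    "00"                                # session id length
    "0004" "0033c011"                   # cipher suites
    "01" "00"                           # compression methods
    "0000"                              # extensions length
)


def parse_client_hello(record: bytes) -> dict:
    # TLS header: content type (1 byte), version (2 bytes), length (2 bytes).
    content_type, record_version, record_len = struct.unpack(">BHH", record[:5])
    body = record[5:5 + record_len]
    # Handshake header: message type (1 byte), length (3 bytes).
    hs_type = body[0]
    hs_len = int.from_bytes(body[1:4], "big")
    msg = body[4:4 + hs_len]

    pos = 0
    client_version = int.from_bytes(msg[pos:pos + 2], "big"); pos += 2
    random = msg[pos:pos + 32]; pos += 32
    sid_len = msg[pos]; pos += 1
    session_id = msg[pos:pos + sid_len]; pos += sid_len
    cs_len = int.from_bytes(msg[pos:pos + 2], "big"); pos += 2
    suites = [int.from_bytes(msg[i:i + 2], "big") for i in range(pos, pos + cs_len, 2)]
    pos += cs_len
    comp_len = msg[pos]; pos += 1
    compression = list(msg[pos:pos + comp_len])
    return {
        "content_type": content_type,
        "record_version": record_version,
        "handshake_type": hs_type,
        "client_version": client_version,
        "random": random,
        "session_id": session_id,
        "cipher_suites": suites,
        "compression": compression,
    }
```

Calling parse_client_hello(CLIENT_HELLO) gives content type 0x16, handshake type 1 (ClientHello) and the two cipher suites 0x0033 and 0xc011.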
It is worth mentioning that, although the ChangeCipherSpec message is sent in the handshake procedure, it is not considered a handshake message and thus has a different content type (0x14) and does not have the handshake header. All the other messages mentioned above are handshake messages with a handshake header and a handshake content type.
Error handling
If something goes wrong during the handshake, the server can send an Alert message, which has content type 0x15 (decimal 21). If you send incorrect TLS messages to the server (such as implementing authentication incorrectly, using incorrect decryption keys, attempting to use unsupported features, or otherwise implementing the protocol wrong) you will get alert messages indicating some kind of failure.
In this article we will mostly ignore these messages. Just be aware that if you play around with toytls and change messages you will surely get these messages from the server.
The handshake from above
In this section of the article we will try to ignore the details of how the different TLS messages are marshalled, and instead focus on the content of each message. If you are interested in the bits and bytes, please have a look at the toytls source code and/or look at Wireshark output. In the description below, we will focus on the cipher suite TLS_RSA_WITH_RC4_128_SHA. Other cipher suites may have slightly different message contents.
>>> ClientHello
Client says it supports TLS_RSA_WITH_RC4_128_SHA, no compression, and no extensions. It sends some random data used for cryptographic purposes.
<<< ServerHello
Server responds with some random bytes of its own, and also says that it accepts the cipher suite and compression option the client suggested. The server also gives a session ID which can be used if the client wants to reuse the TLS connection later. We will skip over the session ID and sessions in this article.
<<< Certificates
Server sends a list of certificates to the client (for HTTPS, these are the certificates shown in the web browser when the website is opened).
<<< ServerHelloDone
Server sends a message indicating that the client should continue with the handshake.
>>> ClientKeyExchange
The client generates a 48 byte random encryption key called the "pre master secret" and encrypts that with the public key in the server certificate. This encrypted key is sent to the server.
>>> ChangeCipherSpec
The client sends the control message ChangeCipherSpec indicating that everything from now on will be encrypted.
>>> Finished
The client sends an encrypted message that contains a hash of all the messages in the handshake sent and received. This makes it possible to detect tampering. This will be described further below.
<<< ChangeCipherSpec
The server sends the control message ChangeCipherSpec indicating that everything from now on will be encrypted.
<<< Finished
The server responds with a Finished message of its own, sending an encrypted hash of the handshake messages.
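The ClientKeyExchange step in the walk-through above is the one place where the client creates secret material, so it may be worth a small sketch. In TLS the 48-byte pre master secret is the protocol version offered in ClientHello followed by 46 random bytes. The function names below are hypothetical, and the RSA encryption itself is assumed to happen elsewhere; only the construction of the secret and the framing of the handshake message are shown.

```python
import os
import struct

HANDSHAKE_CLIENT_KEY_EXCHANGE = 16  # handshake message type


def make_pre_master_secret(client_version=(3, 2)) -> bytes:
    """48 bytes: the version from ClientHello (3, 2 is TLS 1.1) plus 46 random bytes."""
    return bytes(client_version) + os.urandom(46)


def client_key_exchange(encrypted_pre_master: bytes) -> bytes:
    """Wrap the RSA-encrypted secret in a handshake message (without the TLS header)."""
    # The encrypted secret is sent as a vector with a two-byte length prefix.
    body = struct.pack(">H", len(encrypted_pre_master)) + encrypted_pre_master
    header = bytes([HANDSHAKE_CLIENT_KEY_EXCHANGE]) + len(body).to_bytes(3, "big")
    return header + body
```

With a 1024-bit server key the encrypted secret is 128 bytes, so the handshake message is 134 bytes before the TLS header is added.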
Cryptography in the handshake
As mentioned above, the Finished message contains a hash of the handshake sequence. For TLS 1.1, this hash is a slightly unusual construct: it is the concatenation of an MD5 and a SHA1 hash. The hashed data is the content of the handshake messages, without the TLS header.
This means that, when the client constructs ClientHello, it will also update the MD5-SHA1-hash with the content of the message. When it receives ServerHello, it will also update the hash, so the client and server can compare hashes in the Finished messages.
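A running MD5-SHA1 hash of this kind is easy to sketch with hashlib. The class name is made up for this example and the real toytls code may be organized differently; the input to update() is assumed to be a handshake message including its handshake header but excluding the five-byte TLS header.

```python
import hashlib


class HandshakeHash:
    """Running hash over all handshake messages sent and received."""

    def __init__(self):
        self._md5 = hashlib.md5()
        self._sha1 = hashlib.sha1()

    def update(self, handshake_message: bytes) -> None:
        # Feed the same bytes to both hashes.
        self._md5.update(handshake_message)
        self._sha1.update(handshake_message)

    def digest(self) -> bytes:
        # 16 bytes of MD5 followed by 20 bytes of SHA1. hashlib allows
        # further update() calls after digest(), so this can be called
        # once for the client Finished and again for the server Finished.
        return self._md5.digest() + self._sha1.digest()
```

Both sides call update() for every handshake message in the order it appears on the wire, which is why any tampering in transit makes the two digests differ.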
In addition to the MD5-SHA1-hash, TLS 1.1 also uses a rather peculiar pseudo-random function (PRF) based on the two hashes. This pseudo-random function is used to generate keying material for the cryptographic primitives and MACs. The PRF is a combination of HMAC-MD5 and HMAC-SHA1. Please see toytls/hash.py for details.
When the cipher suite is TLS_RSA_WITH_RC4_128_SHA, the PRF is seeded with the pre-master secret, client random and server random data to generate a master secret. This master secret is then seeded to the PRF, again together with the random data, to create keying material. In this case, the keying material is for the RC4 stream cipher and HMAC-SHA1.
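For readers who want to see the shape of the PRF and the two derivation steps, here is a minimal sketch following the TLS 1.1 specification. It is not the code from toytls/hash.py, and the function names are chosen for this example. The labels "master secret" and "key expansion" and the order of the random values come from the specification.

```python
import hmac


def p_hash(hash_name: str, secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand secret and seed into `length` bytes by chaining HMAC calls."""
    out = b""
    a = seed  # A(0)
    while len(out) < length:
        a = hmac.new(secret, a, hash_name).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hash_name).digest()
    return out[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.0/1.1 PRF: P_MD5 over the first half of the secret XOR P_SHA1 over the second."""
    half = (len(secret) + 1) // 2  # the halves share one byte if the length is odd
    s1, s2 = secret[:half], secret[len(secret) - half:]
    md5_part = p_hash("md5", s1, label + seed, length)
    sha1_part = p_hash("sha1", s2, label + seed, length)
    return bytes(x ^ y for x, y in zip(md5_part, sha1_part))


def derive_master_secret(pre_master: bytes, client_random: bytes, server_random: bytes) -> bytes:
    return prf(pre_master, b"master secret", client_random + server_random, 48)


def derive_key_block(master: bytes, client_random: bytes, server_random: bytes, length: int) -> bytes:
    # Note the swapped order of the random values compared to the master secret.
    return prf(master, b"key expansion", server_random + client_random, length)
```

Because both random values go into the seed, a fresh set of keys comes out for every connection even if the same pre master secret were ever reused.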
Both sides (client and server) hold encryption and MAC contexts for themselves and for the other side, so the client in our case has an RC4 encryption key and a MAC key for both the client and the server.
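How the keying material is divided between the two directions can be sketched as follows. For TLS_RSA_WITH_RC4_128_SHA the key block is cut, in this order, into a client MAC key, a server MAC key, a client RC4 key and a server RC4 key (RC4 needs no IV). The type and function names are invented for this example.

```python
from typing import NamedTuple

MAC_KEY_LEN = 20   # HMAC-SHA1 key size in bytes
RC4_KEY_LEN = 16   # RC4_128 key size in bytes
KEY_BLOCK_LEN = 2 * MAC_KEY_LEN + 2 * RC4_KEY_LEN  # 72 bytes in total


class ConnectionKeys(NamedTuple):
    client_mac_key: bytes
    server_mac_key: bytes
    client_rc4_key: bytes
    server_rc4_key: bytes


def split_key_block(key_block: bytes) -> ConnectionKeys:
    """Cut the keying material from the PRF into the four per-direction keys."""
    if len(key_block) < KEY_BLOCK_LEN:
        raise ValueError("key block too short")
    parts, pos = [], 0
    for size in (MAC_KEY_LEN, MAC_KEY_LEN, RC4_KEY_LEN, RC4_KEY_LEN):
        parts.append(key_block[pos:pos + size])
        pos += size
    return ConnectionKeys(*parts)
```

The client encrypts and MACs outgoing data with the client keys and uses the server keys to decrypt and check what it receives; the server does the reverse.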
Demonstrating the handshake
In the toytls project, please try out the script toytls-handshake.py. This will, given a server and port, complete a TLS 1.1 handshake.
Fun with signatures
With the cipher suite TLS_RSA_WITH_RC4_128_SHA, the handshake is authenticated because the pre master secret is encrypted with the public key in the server certificate, making sure only the holder of the private key can decrypt it. From the client side, this is some proof the server is valid.
When the cipher suite sets up a secure channel with Diffie Hellman, this kind of authentication is lacking. Diffie Hellman on its own is anonymous and anyone could impersonate the server. To mitigate this, the TLS cipher suites using Diffie Hellman use RSA or some other algorithm for signing the handshake and encryption parameters. In TLS jargon, this is known as "Ephemeral Diffie Hellman", and is what DHE stands for in the cipher suite names.
To inform the client of this signature, an extra message is introduced in the TLS handshake, known as ServerKeyExchange. This is sent between Certificate and ServerHelloDone. The signature is over the client and server random data, and the encryption parameters.
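A sketch of how a client could check such a signature under TLS 1.1 with an RSA certificate is shown below. In that version the signed value is the MD5 hash followed by the SHA1 hash of the client random, server random and key exchange parameters, padded with PKCS #1 block type 1. The function names are made up for this example, and the public key (n, e) is assumed to have been extracted from the server certificate by other code.

```python
import hashlib
import hmac


def server_key_exchange_digest(client_random: bytes, server_random: bytes, params: bytes) -> bytes:
    """36 bytes: MD5 followed by SHA1 over the randoms and the key exchange parameters."""
    signed = client_random + server_random + params
    return hashlib.md5(signed).digest() + hashlib.sha1(signed).digest()


def verify_rsa_signature(signature: bytes, digest: bytes, n: int, e: int) -> bool:
    """Apply the public key to the signature and compare with the expected padded block."""
    k = (n.bit_length() + 7) // 8  # modulus size in bytes
    if len(signature) != k:
        return False
    block = pow(int.from_bytes(signature, "big"), e, n).to_bytes(k, "big")
    pad_len = k - len(digest) - 3
    if pad_len < 8:
        return False
    # Block type 1: 00 01 FF ... FF 00 <digest>
    expected = b"\x00\x01" + b"\xff" * pad_len + b"\x00" + digest
    return hmac.compare_digest(block, expected)
```

Rebuilding the full expected block and comparing it byte for byte avoids having to parse the padding, which is a common source of verification bugs.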
This means that a client can, using the client random field (as sent in ClientHello and used for encryption purposes), get up to 32 bytes of data signed by the server certificate. This is not very useful on its own. However, the TLS specification says that the first part of the random fields should be a timestamp, so the signed data also includes the server's notion of the current time.
Poor man's trusted timestamping
Trusted timestamping is cryptographically associating a timestamp with some data. Typically, you have a document you want to assert you wrote at a certain date. You take a hash of the document, send it to a trusted third party, and have them sign the hash together with a timestamp.
By using the client random field in the TLS handshake, together with a Diffie Hellman cipher suite, and using the fact that TLS implementations contain a timestamp from the server, you can use Google or Facebook as a trusted third party for timestamping up to 32 bytes of data, sufficient for, for example, a SHA-256 hash.
In the toytls project there are two scripts, toytls-sign.py and toytls-verify.py.
$ cat statement
In this statement I claim that I wrote the following documents the year 2012:
cae3614264895c0201525ec7efff4ca6bb34dfc2 toytls/x509.py
f087da9fe9ad13e8064e6ad951b6aac8c3d54799 scripts/toytls-sign.py
Cheers
be@bjrn.se
$ sha256sum statement
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8 statement
$ sha256sum statement | awk '{print $1}' |
python -c "import sys; sys.stdout.write(sys.stdin.read().strip().decode('hex'))" |
PYTHONPATH=. scripts/toytls-sign.py www.google.com 443 signed_statement
Signing message:
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8
Bytes signed:
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8
5013fa565da4ee2d51c9441cf307578161d1f631a3b4784f193bdd7e9d55b598
03001741049c096ff72fe6a7a1bbc5227a7b9806ab0d12129212a3e700138070
42e35fd8f60efc9cda1ecd9bbf61464a179299b43c3cf195956eedd635a7f859
8091910bf3
Signature:
7c7e3ef5aeea49674e3311112b503c6cfe8149810d5615392d0405939667bf15
95a8c9693cc8d4105a52c85615e7132467757939f72f01354a74882f59463e4d
d76b7eb4ec0de9b6922e2fc3e74336eb0ae619f90f53a2384a1465970a11a9d5
66afd335d3ae9cb2e8f7fd757d5cb5fad530923d29b3df195a963ef699711141
Server certificate:
308203213082028aa00302010202104f9d96d966b0992b54c2957cb4157d4d30
...
20e90a70641108c85af17d9eec69a5a5d582d7271e9e56cdd276d5792bf72543
1c69f0b8f9
$ PYTHONPATH=. scripts/toytls-verify.py signed_statement
Signature Verification SUCCESS
Bytes signed:
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8
5013fa565da4ee2d51c9441cf307578161d1f631a3b4784f193bdd7e9d55b598
03001741049c096ff72fe6a7a1bbc5227a7b9806ab0d12129212a3e700138070
42e35fd8f60efc9cda1ecd9bbf61464a179299b43c3cf195956eedd635a7f859
8091910bf3
Bytes signed, user supplied messsage (hex):
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8
Bytes signed, user supplied messsage (repr):
"\x08\xc2G\xa6X\xbc\xfeFh\xd8S\x19-\xfe\x9a'\xc4\xf7\xbb\xc7\\\xa6\xbcV\x7f\xdcG&\xb1b\x8e\xe8"
Bytes signed, server unix timestamp:
1343486550
Bytes signed, server UTC timestamp:
2012-07-28 14:42:30
Signature:
7c7e3ef5aeea49674e3311112b503c6cfe8149810d5615392d0405939667bf15
95a8c9693cc8d4105a52c85615e7132467757939f72f01354a74882f59463e4d
d76b7eb4ec0de9b6922e2fc3e74336eb0ae619f90f53a2384a1465970a11a9d5
66afd335d3ae9cb2e8f7fd757d5cb5fad530923d29b3df195a963ef699711141
Server certificate. For more details, do:
$ openssl asn1parse -inform DER -in signed_statement.certificate.der
308203213082028aa00302010202104f9d96d966b0992b54c2957cb4157d4d30
...
20e90a70641108c85af17d9eec69a5a5d582d7271e9e56cdd276d5792bf72543
1c69f0b8f9
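The "Bytes signed" value in the output above can be taken apart by hand to see where the message and the timestamp come from. The following is a small sketch, not the code from toytls-verify.py, and the function name is made up. It assumes the layout described earlier: 32 bytes of client random, 32 bytes of server random, then the key exchange parameters.

```python
from datetime import datetime, timezone

# "Bytes signed" copied from the output above.
BYTES_SIGNED = bytes.fromhex(
    "08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8"
    "5013fa565da4ee2d51c9441cf307578161d1f631a3b4784f193bdd7e9d55b598"
    "03001741049c096ff72fe6a7a1bbc5227a7b9806ab0d12129212a3e700138070"
    "42e35fd8f60efc9cda1ecd9bbf61464a179299b43c3cf195956eedd635a7f859"
    "8091910bf3"
)


def split_signed_bytes(signed: bytes):
    client_random = signed[:32]     # chosen by us: the SHA-256 of the statement
    server_random = signed[32:64]   # chosen by the server
    params = signed[64:]            # key exchange parameters
    # The first four bytes of the server random are a big-endian unix timestamp.
    timestamp = int.from_bytes(server_random[:4], "big")
    return client_random, server_random, params, timestamp


client_random, server_random, params, timestamp = split_signed_bytes(BYTES_SIGNED)
print(client_random.hex())
print(timestamp)
print(datetime.fromtimestamp(timestamp, timezone.utc).strftime("%Y-%m-%d %H:%M:%S"))
```

This prints the statement hash 08c2...8ee8, the unix timestamp 1343486550 and the date 2012-07-28 14:42:30, matching the output of toytls-verify.py. The remaining 69 bytes are the key exchange parameters; here they begin with 03 00 17, which identifies a named elliptic curve, followed by a 65-byte public point.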