Is this end-to-end encryption protocol flawed?
There are many end-to-end encryption protocols in use, with Signal being one of the more popular ones. Signal is rather complex, and I'm not entirely sure it is better than some simpler protocols. After giving it some thought, I want to see whether the following protocol (something I made up) is a feasible alternative. This protocol may already exist under some known name. If it does, please let me know.
The goal, as with any end-to-end encryption, is to encrypt data at the source and decrypt it at the destination. This requires a public and private key pair: the public key is used to encrypt the data and the private key is used to decrypt it. What end-to-end encryption does not provide is verification, meaning that you cannot know for certain whether the person sending or receiving the data is who you believe them to be. If the data being sent is just chat messages, there is no real way of knowing whether you are chatting with the person you believe you are. However, if the data is a live video stream, you can be almost certain you are dealing with the real person, although this can also be circumvented if the person has an identical twin, or if an impostor just happens to look and sound like the real person.
The protocol I am describing here takes into account the need for verification, when it is possible to provide it.
Let's assume that we are dealing with two people: Alice and Bob. Alice wants to initiate a chat discussion with Bob. A backend server is used to act as the middle man that facilitates communication between the two people. Ultimately, the design of the end-to-end encryption protocol is to have the server play a minimal role in the communications. This means that the only data that the server should store are public keys and encrypted chat messages. It needs to store these since it is possible that one of the persons is not online and the chat messages need to be forwarded to them as soon as they go online.
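To make the server's limited role concrete, here is a minimal Python sketch of a relay that stores only public keys and opaque encrypted messages, and hands queued messages over when a device comes online. The class name `RelayServer` and its method names are hypothetical, invented for illustration; they are not part of any existing protocol.

```python
from collections import defaultdict, deque


class RelayServer:
    """Hypothetical relay: holds public keys and queued ciphertext only."""

    def __init__(self):
        self.public_keys = {}              # user_id -> public key bytes
        self.pending = defaultdict(deque)  # user_id -> queued (sender_id, ciphertext)

    def register_key(self, user_id, public_key):
        self.public_keys[user_id] = public_key

    def get_key(self, user_id):
        # Returns None while the peer has not registered a key yet.
        return self.public_keys.get(user_id)

    def send(self, sender_id, recipient_id, ciphertext):
        # The server cannot read the ciphertext; it only queues it.
        self.pending[recipient_id].append((sender_id, ciphertext))

    def fetch(self, user_id):
        # Called when a device comes online: drain everything queued for it.
        queue = self.pending[user_id]
        messages = list(queue)
        queue.clear()
        return messages
```

In this sketch a `None` result from `get_key` corresponds to the situation where the other person has not yet signed in, so nothing can be encrypted for them.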
When Alice wants to initiate a chat discussion with Bob, she needs to obtain Bob's public key in order to encrypt the chat message. We'll assume that Bob is about to use the app for the first time and signs in with the backend. Bob's device generates a public key that gets stored on the server. This public key, however, contains some encrypted data that identifies Bob's device. Bob could be using a smartphone or a computer; let's assume he's using a smartphone. The smartphone could have an identifier that distinguishes it from all other devices in the world. This could be a hardware identifier or even a random number that the app generates. When Bob's app creates the public key, it includes this identifier within the key in encrypted form. Whenever Bob's device receives an encrypted message from Alice, it can use his private key to decrypt the message and obtain the unique identifier. If the identifier matches the one from his device, he can be certain that his public key was used to encrypt the message.
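For the "random number that the app generates" option, a minimal sketch might look like the following. The function name `load_or_create_device_id` and the idea of keeping the identifier in a local file are assumptions made for illustration; how the identifier would then be embedded in the key is not covered here.

```python
import secrets
from pathlib import Path


def load_or_create_device_id(path):
    """Return this device's identifier, creating it on first use.

    The identifier is 128 random bits, hex-encoded, stored in a local file
    so that the same value is reused on every later run.
    """
    id_file = Path(path)
    if id_file.exists():
        return id_file.read_text().strip()
    device_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    id_file.write_text(device_id)
    return device_id
```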
Until Bob's public key gets stored on the server, Alice cannot send any chat messages. All Alice would see on her device would be a status message indicating that the connection to Bob is still pending. Once the server has Bob's public key and Alice is online, it will forward the key to Alice. Like Bob, Alice has also created a public key which contains a unique identifier that identifies her device. This key is also stored on the server.
At this point, the initialization for end-to-end encryption is complete. Alice and Bob can now start to send chat messages to each other.
Alice sends a chat message that is encrypted with Bob's public key and sent to the server. The server will then forward it to Bob's device. After decrypting the message, Bob's device verifies that the message contains the unique identifier that belongs to his device. If the identifiers match, then the message had to have been encrypted with Bob's public key.
So the server has sent an encrypted message to Bob indicating who it is from (each person's user ID is used to indicate who they are). But Bob can't really be certain that it is from the person the server says it's from. After all, a man-in-the-middle attacker could obtain the public key and send a message to Bob using Alice's user ID. To address this, we need to take a step back to the initialization phase. When Bob's device received a request to connect to Alice, it began to create a public key and included a unique identifier that got encrypted into the key. But in addition to this device ID, Alice's user ID is also included. So when Alice sends a chat message and Bob's device decrypts it, the device can obtain the sender ID. It can then compare the user ID that the server provided with the user ID encrypted in the message. If the two match, it most likely means that the message did originate from the same person. This, however, does not prevent a man-in-the-middle from obtaining Alice's public key and sending chat messages to Bob using Alice's ID. That's why verification, as described below, is important.
In effect, this means that Bob's device uses a separate public key for each person he communicates with.
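The two checks Bob's device performs after decryption (device ID and sender ID) could be sketched as follows. This assumes the decrypted message is a JSON object with the fields `device_id`, `sender_id` and `body`; those field names, the function name `check_payload`, and the JSON layout are all hypothetical, since the description does not specify how the IDs are carried. Decryption itself is left out.

```python
import json


def check_payload(plaintext, my_device_id, server_claimed_sender):
    """Return the message body if both embedded IDs match, else None.

    `plaintext` is the already-decrypted payload (str or bytes).
    """
    try:
        payload = json.loads(plaintext)
    except ValueError:
        return None  # not valid JSON, so reject the message
    if not isinstance(payload, dict):
        return None
    # Check 1: the message was built for this device's key.
    device_ok = payload.get("device_id") == my_device_id
    # Check 2: the sender named inside matches the one the server reported.
    sender_ok = payload.get("sender_id") == server_claimed_sender
    return payload.get("body") if device_ok and sender_ok else None
```

As noted above, passing both checks does not by itself prove who sent the message; anyone who knows the public key and the IDs could construct such a payload.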
There is, however, the case where a person has multiple devices. When a chat message is sent, it needs to be sent to all of that person's devices. However, since we already decided that each device must include an encrypted device ID within its public key, additional devices can't use the same public key. If one did, then as soon as it decrypted a received message and checked the device ID, it would discover that the ID does not match its own and reject the message.
One solution to this is to repeat the initialization for each device. In essence, you are treating each device as a unique person, even though two or more devices can belong to the same person. For the purpose of strict security, they really should be treated as belonging to completely different people. Even though each device acts like a different person, the public key contains the encrypted user ID, so the recipient's device will notice that the user IDs are the same and use this when displaying the message from that person.
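Treating every device as its own recipient means the sender encrypts the same message once per device key. A minimal sketch of that fan-out, assuming a caller-supplied `encrypt` function (the actual public-key cipher is deliberately left out) and hypothetical labels for each device's key:

```python
def fan_out(plaintext, device_keys, encrypt):
    """Encrypt one message separately for every device of the recipient.

    device_keys: dict mapping a key label -> that device's public key.
    encrypt: caller-supplied function (public_key, plaintext) -> ciphertext.
    Returns a dict mapping each label to the ciphertext for that device.
    """
    return {
        label: encrypt(public_key, plaintext)
        for label, public_key in device_keys.items()
    }
```

The cost of this approach grows with the number of devices, since each message is encrypted and stored once per device.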
These are the basics of the end-to-end encryption. It doesn't offer any verification as to whether the person sending chat messages is the real person. Apps like WhatsApp that use the Signal protocol allow two people to compare an identifier stored on each device to verify that it matches the identifier on the other person's device. For this to work, both persons need to compare the code on their own device with the code on the other device using their eyes. Since both persons are physically seeing each other and have verified that the codes match, the codes must be valid. These codes are derived from the public keys used to send and receive data. If the codes don't match, it might mean that the person they thought they were communicating with was not the real person. WhatsApp makes this comparison easy by creating a QR code that the other person can scan with their device. There's no need to physically read and compare a very lengthy set of numbers.
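One way such a comparison code could be derived is to hash both public keys together so that each device computes the same value. The sketch below is an illustrative construction only; it is not the algorithm WhatsApp or Signal actually use, and the function name `comparison_code` and the 30-digit format are assumptions.

```python
import hashlib


def comparison_code(public_key_a, public_key_b, digits=30):
    """Derive a short numeric code that both devices can compute and compare."""
    # Sorting makes the result independent of which side computes it.
    first, second = sorted([public_key_a, public_key_b])
    # Length-prefix the first key so the concatenation is unambiguous.
    material = len(first).to_bytes(4, "big") + first + second
    digest = hashlib.sha256(material).digest()
    number = int.from_bytes(digest, "big") % (10 ** digits)
    code = str(number).zfill(digits)
    # Group into blocks of five digits for easier reading.
    return " ".join(code[i:i + 5] for i in range(0, digits, 5))
```

If an attacker substituted a different public key for either person, the two devices would derive different codes, which is what the in-person comparison is meant to reveal.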
An alternative way of verifying someone is to use a video stream where both persons see each other and can confirm their identities. Instead of sending a chat message, a video feed is initiated between the two. At the start of the feed, each side sends an encrypted verification message. This doesn't have to be a chat message that is shown to the user. It could be the first video frame, encrypted, or it could just be some arbitrary data indicating that the message is intended for verification purposes. The public key used to encrypt chat messages is used to encrypt this message as well. As soon as both persons see each other, a message pops up on the screen asking whether the person they are looking at is the real person. If the user agrees that the other person is the real person, they can confirm it and that person's public key will be marked as verified. This marking occurs on the device and is stored in cache. Whenever the user chats with that person in the future, they will see a checkmark or some "verified" icon next to the person's name indicating that they have been verified.
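The on-device "verified" marking could be sketched as a small cache that remembers which public key the user confirmed for each contact. The class name `VerifiedContacts` is hypothetical, and tying the checkmark to a hash of the key is an assumption: it means the mark stops applying if the contact's key ever changes.

```python
import hashlib


class VerifiedContacts:
    """Hypothetical on-device cache of contacts the user has confirmed."""

    def __init__(self):
        self._verified = {}  # user_id -> SHA-256 hex digest of the confirmed key

    def mark_verified(self, user_id, public_key):
        # Called when the user confirms the person on the video feed.
        self._verified[user_id] = hashlib.sha256(public_key).hexdigest()

    def is_verified(self, user_id, public_key):
        # True only while the contact still uses the key that was confirmed.
        return self._verified.get(user_id) == hashlib.sha256(public_key).hexdigest()
```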
I'm pretty sure I've overlooked something in all of this and would be grateful for any flaws that can be pointed out.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
