# Limited Entropy Dot ComNot so random thoughts on security featured by Eloi Sanfèlix

3Sep/092

## Crypto Series: Introduction to stream ciphers

Today we're gonna step a little further in our Crypto series. We'll see the main properties of the so-called Stream Ciphers, how they work and some things that should be taken into account when they are used.

Later in this series, we'll see how Linear Feedback Shift Registes (LFSR) work and we'll see one of the most used stream ciphers, together with an example of wrong usage.

General features

As opposed to block ciphers, a stream cipher does not divide the plaintext in big blocks where the cipher is applied, but instead it encrypts individual information elements such as bits, bytes or characters.

Generally, based on an initial key a stream cipher derives a series of characters known as the key stream which is then mixed with the input data, or data stream, generally by means of an exclusive or operation.

This is based on the One Time Pad (OTP) concept, which we didn't mention earlier in this blog but most likely you have already heard of it. The OTP is a cryptographic algorithm which simply generates a random key as long as the message to be encrypted and mixes both, by means of a XOR operation in digital communications.

The properties of the OTP algorithm offer absolute confidentiality in the sense that the cryptogram does not reveal any information about the message contents, at the cost of a key as long as the message itself. Obviously, this makes key management not practical at all: where before we had the problem of sending a message of length L in a secure way, now we can send the encrypted message without any fear but still we need to send a key of length L... which leaves us with an equivalent problem!

Following this philosophy of using random keys as long as messages, the stream ciphers that we are analyzing today were invented. To that end, as I explained above, they try to derive a series of pseudo-random characters based on a secret key. This way, one obtains similar properties to the OTP algorithm reducing the complexity of key management... but of course this also reduces the randomness of the key stream.

Stream cipher classification

Stream ciphers are usually divided into two groups: synchronous and self-synchronizing stream ciphers. Most stream ciphers proposed so far are synchronous ciphers, where the keystream is generated independently of the plaintext and the ciphertext. Therefore, these ciphers require both ends of the communication to be sinchronized, and if a single digit of the cryptogram is lost the rest of the plaintext will be unrecoverable (unless error-correcting techniques are used).

Further, in these systems errors are not propagated besides one single character. This allows an active attacker to modify the contents of the ciphertext without detection. For instance, in a system as the one commented at the beginning, where a XOR of the keystream and the data stream is performed, one could just flip a bit in the decrypted plaintext by flipping the same bit in the cyphertext.

On the other hand, self-synchronizing ciphers generate a keystream dependant on the key and part of the previous ciphertext. Therefore, since a given character depends on the previous ciphertext character, if an error occurs it is possible to resynchronize after some time: we just need to discard as many characters as needed so that the keystream doesn't depend anymore on the corrupted ciphertext.

Some security considerations

It is extremely important in a stream cipher that the key stream does not frequently repeat, especially on those ciphers which use an additive function (i.e. XOR) to mix keystream and data stream. The reason is quite simple: if a given message is compromised but the key is not, any message that uses the same key stream could be compromised simply XORing the known keystream and the ciphertext.

Further, in case a message is not compromised but one obtains several messages encrypted with the same key stream, XORing both messages it is possible to remove the influence of the key stream: we would have the XOR of both initial plain texts. This way, we could possibly obtain information on the transmitted messages (structure, statistical properties, ...) that would help us to break the messages.

We could see an example of this method at Campus Party, where Cucaracha decrypted two ciphertexts encrypted using RC4 based on the knowledge that the language was Spanish and guessing the first message and applying the resulting key stream to the second message to see whether the output was sensible or not. It's completely logical, although requires a detailed work and I have to admit that at first I was shocked when he told me that he got the messages but not the key 😆

To avoid this kind of problems, normally an initialization vector (IV) is used to initiate the cipher and have a different keystream each time. Therefore, it is important that the IV is not reused very often.