Limited Entropy Dot Com Not so random thoughts on security featured by Eloi Sanfèlix

3 Sep 2012

HW Security: fault injection techniques

Posted by Eloi Sanfèlix

So you read my last post and were left wondering how the heck you would be able to inject temporary faults into hardware devices? Here is your answer 🙂

In that post, I explained how to extract keys from cipher implementations assuming we could somehow inject faults during the execution of the cipher. Besides DFA attacks, I also said we could achieve something similar to what we do with software protections (i.e. modifying control flow, bypassing checks, etc.) using fault injection techniques. I thought it was worth giving a few examples of how to inject faults in real hardware to complete the picture.

When hardware is designed, it is engineered to work under certain conditions of temperature, input voltage ranges, clock frequencies, etc. The hardware is tested under those conditions and is supposed to function in that range... and there are no guarantees that it will operate correctly if you bring it outside them.

I guess you follow my reasoning already 🙂 So if we want to inject faults into hardware, a good place to start looking is exactly in those gray areas around the operating conditions. Of course, we want the chip to be functioning properly most of the time, and we want it to fail at the precise moment at which it is computing something sensitive (say a secure boot check, or an RSA-CRT signature). Thus, we probably need to have some control over the timing, and inject the fault only temporarily.

In this post I introduce, from an intuitive perspective, three ways of injecting faults: voltage, clock, and laser/optical glitching.

Voltage glitching

The first example I want to touch on is that of voltage (or VCC) glitching. In this case, we typically run the chip at its nominal voltage (say 3.3V), and whenever we want to inject a fault, we drop voltage down to e.g. 1V.

Example of voltage glitching. The supply voltage is set to 0.8V during a short moment of time.

At this moment, the input voltage to certain gates within the chip will be too low due to the lack of supply voltage. Thus, these gates will receive an input voltage which is below the threshold that indicates whether the signal is a zero or a one, no matter what value it was supposed to be.

Then we increase the voltage again to the nominal voltage of 3.3V, and we have a functioning chip that just failed to execute one of its operations. For instance, it failed to execute a conditional jump and fell through to the code that we wanted to have executed.

The trick here is to find the proper parameters for the glitch: voltage drop (do we go to 0V, to 0.4V, to 1V ...), length of the glitch (a few nanoseconds, a few microseconds?) and the timing. Typically, if the voltage drop and the glitch length are too small, the chip will function properly. If they are too large, it will just die (go mute, reset, or maybe even suffer physical damage). Of course, if the timing is wrong, the attacker will never see the desired effect.
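The parameter search described above can be sketched as a plain sweep over the three knobs. This is a toy model, not a driver for real equipment: `simulated_target` is a hypothetical stand-in for the device plus glitch generator, and every threshold in it is invented purely for illustration.

```python
import itertools

# Hypothetical stand-in for a real target plus glitch generator.
# All thresholds below are invented for illustration only.
def simulated_target(glitch_voltage, glitch_ns, offset_ns):
    """Return 'normal', 'reset' or 'fault' for one glitch attempt."""
    severity = (3.3 - glitch_voltage) * glitch_ns  # crude "strength" of the glitch
    if severity < 50:
        return "normal"          # glitch too weak: chip works as usual
    if severity > 200:
        return "reset"           # glitch too strong: chip mutes or resets
    # Strong enough, but only useful if it hits the sensitive instruction.
    return "fault" if 1000 <= offset_ns < 1020 else "normal"

def sweep(voltages, lengths_ns, offsets_ns):
    """Try every parameter combination and collect the ones that fault."""
    hits = []
    for v, n, t in itertools.product(voltages, lengths_ns, offsets_ns):
        if simulated_target(v, n, t) == "fault":
            hits.append((v, n, t))
    return hits

hits = sweep(
    voltages=[0.0, 0.4, 1.0, 2.0],
    lengths_ns=[10, 50, 100],
    offsets_ns=range(900, 1100, 10),
)
print(len(hits), "parameter sets produced a fault, e.g.", hits[0])
```

On a real setup the classification step is the hard part: the attacker usually has to infer "normal", "reset" or "fault" from the device's responses, and the usable window tends to be narrow in all three dimensions.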

As a protection against this kind of glitching, most modern smart cards and some embedded devices incorporate voltage sensors that detect whether the voltage went below a certain value or not. However, this attack is still effective against a wide range of products.

Clock glitching

Clock glitching is similar to VCC glitching in the sense that it affects another critical parameter of the chip that can be controlled by the attacker. In this case, what we do is inject spurious clock cycles that are much shorter than the original clock cycle.

Example of clock glitching. A very short spurious clock cycle is inserted at the beginning of a normal cycle.

Since the internal logic of the chip operates based on its clock, a short clock cycle will trigger a new operation before the results of the previous one were completely computed or propagated through the device.

Imagine you have to multiply two values, and then add a third value to the product. Normally, multiplying values takes longer than adding them up. Thus, the clock period for a chip that only performs these two operations would be chosen long enough for the multiplication to complete and its result to be ready at the input of the next stage, since that is the critical operation.

Now, if I tell you to start adding up before you received the multiplication result, you will be using invalid (old?) data instead of the correct result. Thus, you will fail at computing the correct result.
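The multiply-then-add scenario can be modelled in a few lines. This is only a behavioural sketch: the 8 ns multiplier delay and the `stale` value left over from an earlier operation are assumptions chosen for the example, not figures from any real chip.

```python
MUL_DELAY_NS = 8   # assumed settling time of the multiplier (the critical path)

def run(a, b, c, period_ns, stale=0):
    """Compute a*b + c in two clocked stages.

    `stale` is whatever the multiplier output held before the new
    operands arrived. If the clock edge comes before the multiplier has
    settled, the register latches that stale value instead of a*b.
    """
    product_reg = a * b if period_ns >= MUL_DELAY_NS else stale
    return product_reg + c

print(run(6, 7, 1, period_ns=10))            # normal cycle: prints 43
print(run(6, 7, 1, period_ns=3, stale=12))   # glitched cycle: prints 13
```

In real silicon the latched value is not necessarily the clean old result; it may be a mix of old and new bits, which is one reason the fault model is often unknown.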

Clock glitching exploits exactly that situation. Again, finding the right parameters in this case is the key to success.

As for hardware level protections, frequency sensors as well as using internally generated clocks (using on-die oscillators) are generally the most common ways to protect against clock glitching.

Additionally, fast clocks make these attacks less practical for attackers, since they need to inject even faster clock cycles and synchronize their attack at a higher speed.

This is why clock glitching is less effective nowadays: most high-end smart cards use their own on-die clocks, and embedded systems require much higher clock speeds.

Optical glitching

After clock and VCC glitching, we move to the real king of current fault injection attacks. Optical fault injection, or more commonly laser fault injection, uses a light beam to inject faults into semiconductor devices.

How is this possible? Well, light (physicists, don't kill me!) basically consists of a number of photons carrying a certain amount of energy. Roughly, when these photons reach a semiconductor (typically silicon in electronic devices), their energy is absorbed by the semiconductor.

Given enough energy, electrons that would otherwise be still within the semiconductor will start to move, creating current. So, for our chips, this means that some of the transistors in the chip will actually switch when they should not!

The big difference between this fault injection technique and the previous ones is that in this case we actually have spatial selectivity (or resolution): we can choose which parts of the chip we attack by pointing the laser beam to them.

Of course, this is very powerful but at the same time it adds extra complexity to the attack: now you need to find the sensitive spots in the chip. As before, there are a number of parameters one needs to play with in order to successfully inject faults: glitch timing, glitch length, wavelength of the injected light and amount of energy injected.

Also, this attack is semi-invasive: we need to open up the chip package so that the light radiation can reach the die. Otherwise, the light will be blocked by the package or the plastic around the smart card die. Thus, this attack provides additional power at the cost of additional complexity, as usual.

In terms of hardware level protections, this is also the most difficult attack to prevent. Typically light sensors are scattered around the chip, but manufacturers cannot place sensors everywhere (that's expensive!) so there are always open spots.

At the end of the day, fault injection protection requires a combination of hardware and software prevention and detection mechanisms: typically sensors at the hardware level and double-checks and redundancy at the software side.
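As a rough illustration of the "double-checks and redundancy" idea on the software side, here is a minimal sketch. The function `check_pin` and its structure are hypothetical, and Python is used only for readability; real countermeasures live in firmware, where the compiler must also be prevented from merging the two checks.

```python
def check_pin(entered: bytes, stored: bytes) -> bool:
    """Grant access only if two separate comparisons both succeed."""
    granted = False                  # fail-closed default
    first = entered == stored
    second = stored == entered       # redundant re-check of the same condition
    if first != second:
        # The two checks disagree, which should never happen in a
        # fault-free run: treat it as a detected fault.
        raise RuntimeError("fault detected")
    if first and second:
        granted = True
    return granted

print(check_pin(b"1234", b"1234"))   # prints True
print(check_pin(b"0000", b"1234"))   # prints False
```

The point of the design is that a single skipped or corrupted comparison is no longer enough: the attacker needs two well-timed faults, or one fault that the disagreement check does not catch.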

Due to the difficulty of completely preventing these attacks, fault injection is nowadays one of the main threats to secure hardware. Additionally, this difficulty together with the physical nature of the attacks also means that simulating them is typically not enough to assure appropriate protection levels, making fault injection testing key for secure hardware.

25 Aug 2012

Crypto Series: Differential Fault Analysis by examples

Posted by Eloi Sanfèlix

So, after more than a year without writing anything here, I was bored today and thought it would be nice to share a new piece on attacking cryptographic implementations here 🙂

Differential Fault Analysis (DFA) attacks are part of what is known as fault injection attacks. That is, they are based on forcing a cryptographic implementation to compute incorrect results and attempting to take advantage of them. With fault injection attacks (also often called active side channel attacks) one can achieve things like unauthenticated access to sensitive functionality, bypassing secure boot implementations, and basically bypassing any security checks an implementation performs.

With DFA attacks, one is able to retrieve cryptographic keys by analyzing correct/faulty output pairs and comparing them. Of course, this assumes you are able to inject faults somehow... which is often true in hardware implementations: gaming consoles, STBs, smart cards, etc. At the software level, one can achieve similar things by debugging the implementation and changing data or by patching instructions... but this is something we have been doing for a long time, haven't we? 🙂 I often say that fault injection attacks are the analog version of 'nopping' instructions out in a program, although we often do not know exactly what kind of faults we are injecting (i.e. we often miss a fault model, but we still successfully attack implementations in this way).

There are ways to protect against this kind of attack as an application programmer, but this is not the objective of this post. In the remainder of this post, I will explain two powerful DFA attacks on two modern cryptographic algorithms: RSA and (T)DES. For some information on protecting from these attacks as a programmer, take a look at these slides. If there is some interest, I will outline the most common techniques to perform fault attacks in a future post.

10 May 2008

Fault injection: Attack on RSA-CRT

Posted by Eloi Sanfèlix

After a long period of hibernation, we are back with an example of fault injection on the RSA algorithm using the Chinese Remainder Theorem (CRT). This theorem states that if we have a pair of congruences

x \equiv x_p \pmod{p}

x \equiv x_q \pmod{q}

with p and q prime, then x (mod p·q) can be computed from them and two auxiliary results.

Because of this, RSA can be split from one modular exponentiation with a huge modulus into two modular exponentiations whose moduli are roughly half the size of the original. This yields a performance improvement, which is essential in resource-constrained applications such as smart cards. Moreover, the auxiliary results can be precomputed, so they can be loaded onto the card together with the key, reducing the workload.
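A small worked example of the split, with toy numbers. It uses one common formulation (Garner's recombination with `dp`, `dq` and `q_inv` as the precomputed values); these names and the exact choice of auxiliary values are assumptions and may differ from the ones the post has in mind.

```python
# Toy parameters, far too small for real use.
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (needs Python 3.8+)

# Auxiliary values that can be precomputed and stored with the key.
dp = d % (p - 1)
dq = d % (q - 1)
q_inv = pow(q, -1, p)

def sign_crt(m):
    """Compute m^d mod n from two half-size exponentiations."""
    sp = pow(m, dp, p)                 # x_p: the result modulo p
    sq = pow(m, dq, q)                 # x_q: the result modulo q
    h = (q_inv * (sp - sq)) % p        # Garner's recombination step
    return sq + h * q

m = 65
assert sign_crt(m) == pow(m, d, n)     # same result as the full-size exponentiation
assert pow(sign_crt(m), e, n) == m     # and it verifies with the public exponent
```

Each of the two exponentiations works with numbers about half the length of n, which is where the speed-up comes from.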

However, in these environments it is possible to inject faults, as I explained in this post. And what does this have to do with RSA implementations that use the CRT? As we are going to see, a lot 🙂
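A minimal sketch of why a single fault matters so much here, using the well-known gcd argument (often called the Bellcore attack). Whether this is the exact variant the post describes is an assumption, and the fault is modelled in the crudest possible way: the `fault` flag simply corrupts the mod-p half of the computation.

```python
from math import gcd

# Toy parameters, far too small for real use.
p, q = 61, 53
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))   # needs Python 3.8+

def sign_crt(m, fault=False):
    """RSA-CRT signature; `fault` is a hypothetical switch modelling a glitch."""
    sp = pow(m, d % (p - 1), p)
    sq = pow(m, d % (q - 1), q)
    if fault:
        sp = (sp + 1) % p              # corrupt only the mod-p half
    h = (pow(q, -1, p) * (sp - sq)) % p
    return sq + h * q

m = 65
good = sign_crt(m)
bad = sign_crt(m, fault=True)

# good and bad agree modulo q but differ modulo p,
# so q divides their difference while p does not.
recovered = gcd(good - bad, n)
print(recovered)   # prints 53, one of the secret primes
```

Once one prime factor of n is known, the other follows by division and the private key can be rebuilt.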