Saturday 6 January 2018

New Cache Side Channel Attack Using Processor Speculation

Computing systems rely on privilege separation to protect the memory of different processes. In simpler terms, a process can access only the memory pages that the MMU maps into its own address space; only the kernel, which runs in privileged mode, can access memory beyond that.

Cache side-channel attacks are not new. They have been demonstrated in various forms over the years. What is new in the latest attacks is that processor speculation is used to drive the side channel.

Cache Access

Some points about the cache hierarchy before going into the specifics of the attack:
  • The cache hierarchy is shared between processes.
  • Caches are usually set associative. The cache is divided into sets, and each set contains multiple ways. When a cache line is accessed, the index part of the address selects the set. A parallel search of all the ways in that set then compares the tag part of the address. If a tag matches and the cache line is valid, the access hits in the cache.
  • A cache can be indexed and tagged using either physical or virtual addresses, giving four combinations.
  • A cache level can be inclusive or exclusive.


The figure above shows a 4-way set associative cache. There are 256 sets in the cache, so 8 bits of the address are used to select one. Once a set is selected, any of the 4 ways can hold the line we are interested in, so 4 tag comparisons are done in parallel, each enabled by the valid bit of its cache line.
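The lookup described above can be sketched as a small software model in C. Only the 256 sets and 4 ways come from the figure; the 64-byte line size (6 offset bits) and all names here (cache_fill, cache_lookup, struct cache_line) are assumptions for illustration. Real hardware compares all ways in parallel, whereas this loop checks them one by one.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_BITS 6                 /* assumption: 64-byte cache lines */
#define SET_BITS  8                 /* 256 sets, as in the figure */
#define NUM_SETS  (1u << SET_BITS)
#define NUM_WAYS  4                 /* 4-way set associative */

struct cache_line {
    bool     valid;
    uint64_t tag;
};

static struct cache_line cache[NUM_SETS][NUM_WAYS];

/* Index: the 8 address bits just above the line offset. */
unsigned set_index(uint64_t addr)
{
    return (unsigned)((addr >> LINE_BITS) & (NUM_SETS - 1));
}

/* Tag: all address bits above the index. */
uint64_t tag_of(uint64_t addr)
{
    return addr >> (LINE_BITS + SET_BITS);
}

/* Install the line holding addr in the given way.
 * Choosing the way (the replacement policy) is out of scope here. */
void cache_fill(uint64_t addr, int way)
{
    struct cache_line *line = &cache[set_index(addr)][way];
    line->valid = true;
    line->tag = tag_of(addr);
}

/* Hit only if some way of the selected set is valid and its tag matches. */
bool cache_lookup(uint64_t addr)
{
    const struct cache_line *set = cache[set_index(addr)];
    for (int way = 0; way < NUM_WAYS; way++) {
        if (set[way].valid && set[way].tag == tag_of(addr))
            return true;
    }
    return false;
}
```

Two addresses that differ only above bit 13 map to the same set with different tags, so they compete for the same 4 ways. That competition is what a cache side channel observes.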

Cache Side-channel Attack

Since the cache is shared between processes, a process can manipulate the lines in the ways of a cache set so that it learns which cache lines in the set are used by the victim process and which addresses are valid for the victim. Flush and Reload is one such side-channel attack, and it is used against inclusive last-level caches. The attacker flushes the monitored cache lines and then reloads them after the other process has run. If the access timing shows that the reload hit in the cache, we know the address is valid and the victim process has loaded the line into the cache.
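The two steps of Flush and Reload can be sketched in C with x86 intrinsics. This is a minimal sketch, assuming an x86 processor and a GCC or Clang toolchain; the function names and the threshold of 100 timestamp ticks are assumptions for illustration, and a real attack would calibrate the threshold on the target machine.

```c
#include <stdint.h>
#include <x86intrin.h>   /* _mm_clflush, _mm_mfence, _mm_lfence, __rdtscp */

/* Assumed cut-off between a cache hit and a memory access, in TSC ticks.
 * This value is machine dependent and must be calibrated in practice. */
#define FR_HIT_THRESHOLD 100

/* Flush step: evict the line holding addr from the cache hierarchy. */
void fr_flush(const void *addr)
{
    _mm_clflush(addr);
    _mm_mfence();                /* wait for the flush to complete */
}

/* Reload step: time a single read of addr. */
uint64_t fr_reload_ticks(const void *addr)
{
    unsigned aux;
    _mm_mfence();
    uint64_t start = __rdtscp(&aux);
    _mm_lfence();                /* keep the load after the first timestamp */
    (void)*(const volatile unsigned char *)addr;   /* the timed load */
    uint64_t end = __rdtscp(&aux);                 /* waits for the load */
    _mm_lfence();
    return end - start;
}

/* Nonzero if the reload was fast enough to have been served from cache. */
int fr_was_cached(const void *addr)
{
    return fr_reload_ticks(addr) < FR_HIT_THRESHOLD;
}
```

An attacker would call fr_flush on a line shared with the victim, let the victim run, and then call fr_was_cached on the same line. A fast reload suggests the victim touched the line in between.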

Using Processor Out-of-order Vulnerability to Access Memory Belonging to Others

In the newer Meltdown attack, the speculative memory accesses of modern processors are used to load the memory that a Flush and Reload attack then probes. In modern, highly pipelined superscalar architectures, processor architects rely on speculative execution to keep the pipeline busy. Branch predictors are used to predict not only the branch direction but also the target address. The predicted path is retired and committed only after the branch has actually been resolved. But as the authors of the Meltdown attack show, if the speculated path contained memory loads, the cache state stays modified even when the prediction turns out to be wrong. That is, the memory hierarchy state is not rolled back.

When a process is executing, its address space has both user space and kernel space mapped. However, user-space code cannot access the kernel address space directly; kernel privilege is required, and this access usually happens via the syscall interface.

We can have a process request a cache line along a path that will be speculated but never actually executed. This can be done in many ways, for example with a branch that is never taken, while tricking the branch predictor into predicting the other direction so that the path is speculated. It can also be done by placing the access after an instruction that raises an exception and then handling that exception. All the details are in the paper, which is a must-read [1].

Example from the Google blog [4]


In simpler terms, this speculative load can be used to load memory belonging to the kernel via kernel calls. For example, suppose there is a call to which we provide a buffer to read and an offset into some data structure, and the kernel validates this offset before loading. Speculation can then perform the load before the validation has been resolved, even when the offset is out of bounds.

struct array {
  unsigned long length;
  unsigned char data[];
};
struct array *arr1 = ...; /* small array */
struct array *arr2 = ...; /* array of size 0x400 */
/* >0x400 (OUT OF BOUNDS!) */
unsigned long untrusted_offset_from_caller = ...;
/* Bounds check: can be mispredicted as true during speculation. */
if (untrusted_offset_from_caller < arr1->length) {
  /* Speculative out-of-bounds read of the secret byte. */
  unsigned char value = arr1->data[untrusted_offset_from_caller];
  /* Bit 0 of the secret selects index 0x200 or 0x300. */
  unsigned long index2 = ((value&1)*0x100)+0x200;
  if (index2 < arr2->length) {
    /* This load leaves a trace in the cache. */
    unsigned char value2 = arr2->data[index2];
  }
}


For example, after the code above from [4] has been speculatively executed, checking whether arr2->data[0x200] or arr2->data[0x300] is cached tells us the value of arr1->data[untrusted_offset_from_caller]&1: a cached arr2->data[0x200] means the bit is 0, and a cached arr2->data[0x300] means it is 1.
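The arithmetic behind this decoding step can be written out as a short C sketch. The index computation is the one from the snippet in [4]; the function names probe_index and recover_bit are hypothetical, and how the two "cached" observations are obtained (for example by a timing probe) is left out.

```c
/* Same arithmetic as the snippet from [4]:
 * bit 0 of the secret byte selects index 0x200 or 0x300. */
unsigned long probe_index(unsigned char value)
{
    return ((value & 1) * 0x100) + 0x200;
}

/* Recover the leaked bit from which probe location was found cached.
 * Returns 0 or 1, or -1 if the observation is inconclusive
 * (neither or both locations cached). */
int recover_bit(int cached_0x200, int cached_0x300)
{
    if (cached_0x200 && !cached_0x300)
        return 0;
    if (cached_0x300 && !cached_0x200)
        return 1;
    return -1;
}
```

The two indices are 0x100 bytes apart, so on a processor with 64-byte cache lines they fall on different lines and can be told apart by a cache probe.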

In the kernel, the eBPF bytecode interpreter and JIT engine are used to create code like the above, which is then used to leak kernel memory. Complete information about the attack can be found in [4].

Kernel Page Table Isolation (KPTI) 

Isolating the kernel page tables from user space has been proposed as a way to prevent this attack [2][3]. Since this is a hardware issue, fixing it in software is costly in terms of performance. It has been reported that some syscall-heavy applications can suffer significant performance losses. We live in a world where isolation is very important, especially now that cloud and virtualisation are the norm and preventing one user's data from being accessed by another is essential. So this is a necessary thing to do even if the performance penalty is significant.
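To see whether a running Linux kernel applies this mitigation, one option is to read the status line the kernel exposes in sysfs. The sketch below assumes a kernel that provides /sys/devices/system/cpu/vulnerabilities/meltdown (Linux 4.15 and later, plus stable kernels with the backport); the function name read_meltdown_status is hypothetical. On such kernels the file typically contains a line such as "Mitigation: PTI", "Vulnerable" or "Not affected".

```c
#include <stdio.h>
#include <string.h>

/* Copy the kernel's Meltdown status line into buf (at most len bytes).
 * Returns 0 on success, or -1 if the sysfs file is missing or unreadable,
 * for example on an older kernel or a non-Linux system. */
int read_meltdown_status(char *buf, size_t len)
{
    FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/meltdown", "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}
```

A caller would pass a small buffer, for example char status[128], and print it when the function returns 0.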


References:

[1] https://meltdownattack.com/meltdown.pdf
[2] https://lkml.org/lkml/2017/12/4/709
[3] https://gruss.cc/files/kaiser.pdf
[4] https://googleprojectzero.blogspot.com.au