Thursday 18 April 2019

Secure programming with GCC and Undefined behaviour


Buffer overflows, heap overflows, double frees, null pointer dereferences, format string bugs and the like are the vulnerabilities most commonly exploited. Compilers work hard to extract every bit of performance, and to do so they assume that a program contains no undefined behaviour. That assumption matters for the performance of many classes of application. However, it also means that a compiler can inadvertently introduce security flaws if the program does have undefined behaviour.


For example:

char buffer[BUFLEN];
char *buffer_end = buffer + BUFLEN;
if (buffer + len >= buffer_end || buffer + len < buffer) {
    return  ERROR;
}
The code above tries to check for out-of-bounds memory access. However, according to the C standard, pointer arithmetic that produces an address outside the object (beyond one past its end) is undefined, so the compiler may assume that buffer + len never wraps around to an address below buffer. It can therefore optimise away the second condition, leaving the code vulnerable to buffer overflow. The trouble with security vulnerabilities introduced by undefined behaviour is that they can show up suddenly, with a new compiler version or a new optimisation flag, without any change to the code. This check used to be a common idiom in C/C++, and hence when compiler optimisations improved, many applications broke [1]. CERT [2] lists such problems along with ways to rewrite the code so that compilers do not optimise it in unintended ways. GCC also provides multiple tools that can spot potential issues and warn users, and in some cases it allows the user to disable the optimisations. Let's now look at some examples from GCC Bugzilla related to undefined behaviour and how we can detect or handle them.
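Before moving on to the Bugzilla examples, here is one way the bounds check at the top of this section might be rewritten so that it does not depend on pointer wrap-around. This is only a sketch: the values of BUFLEN and ERROR and the function name check_offset are assumptions made for illustration. The idea is to compare the integer offset against the buffer size, so that no out-of-bounds pointer is ever formed.

```c
#include <stddef.h>

#define BUFLEN 64   /* hypothetical size, chosen for illustration */
#define ERROR  (-1) /* hypothetical error code */

static char buffer[BUFLEN];

/* Returns 0 if offset len lies inside buffer, ERROR otherwise.
 * Only integers are compared, so no out-of-bounds pointer is formed
 * and the check does not rely on undefined behaviour. */
int check_offset(size_t len)
{
    if (len >= BUFLEN) {
        return ERROR;
    }
    buffer[len] = 0; /* safe here: len < BUFLEN */
    return 0;
}
```

Because len is an unsigned size_t, a huge value that would have wrapped the pointer in the original check simply fails the len >= BUFLEN comparison.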


Some interesting Issues

  • Infinite loop generated for non-infinite code - PR53073

int d[16];
int SATD (void)
{
  int satd = 0, dd, k;
  for (dd = d[k = 0]; k<16; dd = d[++k])
  {
    satd += (dd < 0 ? -dd : dd);
  }
  return satd;
}

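The C standard allows a pointer to point one element past the end of an array, but reading that location is undefined. In the final iteration, the increment expression dd = d[++k] reads d[16] before k < 16 is tested again. The compiler can therefore assume that k never reaches 16 at the point of the k < 16 test, treat the condition as always true, and emit an infinite loop. More details are in the GCC Bugzilla entry for PR53073.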
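A possible rewrite of the loop, shown as a sketch rather than as the fix adopted by any particular project, reads the array element inside the loop body, after the bound has been checked. With this shape d[16] is never accessed.

```c
int d[16];

/* Sum of absolute values of d[0..15]. The element is read inside the
 * body, after k < 16 has been tested, so d[16] is never accessed. */
int SATD(void)
{
    int satd = 0;
    for (int k = 0; k < 16; k++) {
        int dd = d[k];
        satd += (dd < 0 ? -dd : dd);
    }
    return satd;
}
```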

  • Signed integer overflow - PR30475

According to the C and C++ language standards, overflow of a signed value is undefined behaviour, and a correct (standard-conforming) C/C++ program must never cause signed overflow. The compiler can therefore optimise away checks that only succeed when overflow has already occurred; -fno-strict-overflow or -fwrapv disables this. Unfortunately, some of these overflow checks used to be popular idioms for preventing buffer overflows in many code bases.
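As an illustration, a check such as a + b < a tests for overflow after it has already happened, which is exactly what the compiler is allowed to assume away. The sketch below shows two alternatives; the function names are hypothetical. The first tests the operands before adding, using only standard C. The second uses __builtin_add_overflow, a builtin provided by GCC (version 5 and later) and Clang, so it is not portable to every compiler.

```c
#include <limits.h>
#include <stdbool.h>

/* Portable pre-check: inspects the operands before adding, so the
 * comparison itself can never overflow. */
bool add_would_overflow(int a, int b)
{
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return false;
}

/* GCC/Clang-specific: the builtin returns true when the addition
 * overflows. This wrapper returns true on success and stores the
 * sum in *sum. */
bool checked_add(int a, int b, int *sum)
{
    return !__builtin_add_overflow(a, b, sum);
}
```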

  • Divide by zero is undefined 

#include <stdio.h>
int testdiv (int i, int k) {
    if (k == 0) printf ("found divide by zero\n");
    return (i / k);
}
int main() {
    int i = testdiv (1, 0);
    return (i);
}

This is based on an error found with PostgreSQL 8.1.5 on Solaris 9 SPARC with gcc-4.1. Since k is used as a divisor, the compiler assumes that k cannot be zero, and the print statement is optimised away.
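One way to keep the diagnostic is to make sure the division is never reached when k is zero. The following is a minimal sketch; returning 0 in the error case is an assumption made for illustration, and real code would pick whatever error signalling suits the caller.

```c
#include <stdio.h>

/* Returns i / k, or 0 when k is zero. The early return means the
 * division is never executed with a zero divisor, so the compiler
 * has no grounds to drop the check. */
int testdiv(int i, int k)
{
    if (k == 0) {
        printf("found divide by zero\n");
        return 0; /* hypothetical fallback value */
    }
    return i / k;
}
```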

  • Dereferencing a NULL pointer is undefined - PR29968

static unsigned int tun_chr_poll (struct file *file, poll_table * wait)
{
    struct tun_file *tfile = file->private_data;
    struct tun_struct *tun = __tun_get(tfile);
    struct sock *sk = tun->sk;
    unsigned int mask = 0;

    if (!tun)
        return POLLERR;
    /* …. */
}
This example is from the Linux kernel (https://lwn.net/Articles/342330/). Since tun has already been dereferenced by the time it is checked, the compiler can assume it is not NULL and optimise away the check.
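The general remedy is to perform the NULL check before the first dereference. The sketch below illustrates the ordering only. It is not the kernel's actual code: the struct definitions, the POLLERR_SKETCH constant and the function poll_sketch are hypothetical stand-ins so that the example compiles on its own.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types in the excerpt above. */
struct sock { int unused; };
struct tun_struct { struct sock *sk; };

#define POLLERR_SKETCH 0x8u /* hypothetical error mask */

unsigned int poll_sketch(struct tun_struct *tun)
{
    struct sock *sk;

    if (!tun)
        return POLLERR_SKETCH;

    sk = tun->sk; /* dereference only after the NULL check */
    return sk ? 1u : 0u;
}
```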

  • Calling a NULL Object is undefined - PR68853

GCC 6 exposes undefined behaviour in the Chromium V8 garbage collector: calling a member function on a NULL object is undefined. -fno-delete-null-pointer-checks allows this optimisation to be disabled so that non-conforming code can still work.

  • Reading an uninitialised variable is undefined

The compiler can assign any value to the variable and to expressions derived from it:
struct timeval tv;
unsigned long junk;

gettimeofday (&tv, NULL);
srandom ((getpid() << 16) ^ tv.tv_sec ^ tv.tv_usec ^ junk);
This example is from http://kqueue.org/blog/2012/06/25/more-randomness-or-less/. When compiled with a version of LLVM, the entire seed computation is optimised away: the results of gettimeofday() and getpid() are not used at all, and srandom() is called with some garbage value.
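A simple way to avoid the problem is to leave the uninitialised variable out of the expression, so that every input to the seed has a defined value. The sketch below assumes a POSIX system; the helper name make_seed is hypothetical.

```c
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>

/* Builds a seed from the process ID and the current time only.
 * Every operand is initialised, so the compiler must evaluate the
 * whole expression. */
unsigned int make_seed(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return ((unsigned int)getpid() << 16)
           ^ (unsigned int)tv.tv_sec
           ^ (unsigned int)tv.tv_usec;
}
```

It could then be used as srandom(make_seed()). The casts to unsigned int also keep the shift and XOR operations in unsigned arithmetic, which is well defined.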

  • Pointer arithmetic that wraps - PR54365

Reference:

[2] https://wiki.sei.cmu.edu/confluence/display/c/CC.+Undefined+Behavior#CC.UndefinedBehavior-ub_46