Tuesday, December 2, 2014

The DieHard Safe Memory Allocator

1 DieHard: Probabilistic Memory Safety for Unsafe Languages

1.1 Intro

  • Memory errors such as buffer overflows, use-after-free, and uninitialized memory usage can compromise C/C++ applications
  • Tools like Valgrind can help track down errors, but only catch errors observed during execution
  • Programs which detect errors and fail fast can be undesirable if the end user prioritizes uptime
  • Failure-oblivious programs, which drop illegal writes and return sentinel values on illegal reads, keep the program running but make no promises about correctness to the programmer.
  • Probabilistic memory safety is a suggested solution to this; DieHard is one such implementation


1.2 Overview

1.2.1 Has stand-alone and replicated modes; both rely on a novel memory manager which allows computing the exact probability of detecting or avoiding an error.

1.2.2 Allocates objects randomly across heap which is much larger than required

  • Makes buffer overflow likely to only overwrite empty space

1.2.3 Performance hit is small for most applications, and some appear to run faster on the Windows XP platform.

1.2.4 Replicated manager uses a different policy.

  • Run multiple copies of program simultaneously
  • Seed the random allocators differently, so replicas allocate into different parts of the heap
  • If the program's output differs, then it depends on the allocation order or uninitialized memory values.
  • Otherwise, the runtime can say that memory errors didn't impact the running process.

1.2.5 While there is a performance cost to this for sequential execution, multiple processors can be used to trade parallelism for redundancy and safety.

  • Fair trade when the parallelism is going unused by the programs.

1.3 Probabilistic Memory Safety

  • Infinite-heap semantics makes it possible for programs which would otherwise be illegal to execute correctly in the presence of errors
  • Because objects are infinitely far apart, buffer overflows are harmless
  • If frees are 'ignored' and allocated objects are never overwritten, then dangling pointers are harmless.
  • Uninitialized reads are still undefined because the contents of new C/C++ objects are not zeroed out.

1.3.1 Approximating infinite heaps

  • Impossible to implement an infinite-memory heap, but can approximate behavior.
  • By allocating a multiple of the necessary memory, the program stays safe as long as a pointer does not stray too far past its object.
  • If freed objects are reclaimed in random order, a recently freed object is unlikely to be reallocated soon, so a stale reference usually still sees the old data.

1.3.2 Detecting uninitialized reads

  • We can't perfectly detect uninitialized reads, but we can fill uninitialized heap memory with different pseudorandom values in two different runs of the program
  • If they behave differently, then the different random values have been read.

1.4 Randomized Memory Management

1.4.1 Initialization

  • Initializer takes the M-factor, the multiple of the required memory that the heap will use, which sets the degree of redundancy.
  • Allocates from freelists separated into power-of-two size classes
    • Necessary because otherwise small-object allocations would fragment the heap, since objects would be separated by large buffer spaces.
  • Heap metadata is kept separate from the heap so that buffer overflows cannot destroy heap parsability.
  • Allocation works like probing a hash table: a random slot is chosen, the allocator checks whether it is free, and open addressing finds a new slot on conflict.

1.4.2 Deallocation

  • Checks that the object is currently allocated before deallocating it, preventing invalid and double frees

1.4.3 Limiting buffer overflows

  • Replacement versions of the copying functions (strcpy/strncpy) are used which check that copy requests do not span more than one heap object.

1.4.4 Discussion

  • DieHard makes no effort to improve locality or space use; it is concerned with safety
  • Space overhead comes from rounding allocation requests up to powers of two and from spacing objects apart
  • Saves some space through:
    • Size classes to reduce external fragmentation
    • Low per-object overhead (a single bit in the allocation bitmap)
  • Potentially unacceptable for large-heap programs.

1.5 Replication

1.5.1 Allocator changes cannot prevent reading of uninitialized memory

  • Solution: run multiple program copies at once with heaps full of different pseudorandom values and ensure that they behave the same.

1.5.2 Replicas and Input

  • DieHard spawns each replica process and redirects its stdin and stdout through DieHard
  • DieHard itself handles the stdin/stdout intended for the program
  • There appears to be no mechanism for network I/O currently

1.5.3 Voting

  • After a period of time, the buffered outputs of a number of replicas have been pooled
  • DieHard takes the majority vote of the outputs and sends the winning output to stdout

1.5.4 Discussion

  • Must intercept system calls that get the date/clock
  • Voting cost amortized to chunks
    • Latency cost?
  • If a replica loops infinitely, there is currently no policy to handle it.

1.6 Analysis

1.6.1 Equation derived that predicts the likelihood of avoiding memory errors
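A simplified bound in the spirit of that analysis (not the paper's exact equation): with the heap M times larger than the live data, at most a 1/M fraction of slots are in use, so a one-slot overflow lands on live data with probability at most 1/M in a single replica, and independently randomized replicas compound the protection.

```latex
% Single replica: the overflow target is a uniformly random slot,
% of which at most a 1/M fraction hold live data.
P(\text{overflow corrupts one replica}) \le \frac{1}{M}
% r replicas with independent layouts: the error goes unmasked only
% if every replica is corrupted.
P(\text{overflow corrupts all } r \text{ replicas}) \le \left(\frac{1}{M}\right)^{r}
```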

1.6.2 Seems sound, fairly good for prototyping and debugging if not production

1.7 Experimental Results / Benchmarks

  • Linux: DieHard has higher overhead than BDW collector for high-allocation programs, but lower overhead for most other benchmarks
  • Windows: Running time performance is more or less equal to that of default allocator.
  • The allocator used in DieHard is faster than the Windows default allocator
  • Visual Studio produces faster binaries than g++ does because of better inlining
  • Injecting errors into the Squid web cache and into test programs showed that DieHard allowed them to continue running
Paper
