AdaCore Blog

An Introduction to Memory Safety Concepts and Challenges

by Forough Goudarzi

Introduction

Memory safety, a key concept in software development, addresses how an application handles memory operations, such as reading, writing, allocation, and deallocation. A memory-safe application operates within the bounds of its allocated memory -- it doesn't access or modify memory locations that it's not allowed to access -- and frees memory after use. Improper management of memory could result in severe problems ranging from crashing the application to security vulnerabilities that can be exploited by attackers.

Numerous software bugs can lead an application to read or write outside its allocated memory space. To avoid such bugs, safety measures must be enforced either manually or by using a memory-safe programming language. Manual enforcement is tedious and error-prone, whereas a memory-safe language provides built-in dynamic and/or static checks, along with supporting tools, that achieve the same goal with minimal manual effort. The second option is in line with the US National Security Agency's recommendation.

This blog starts by describing the most common bugs threatening memory safety and explaining their consequences. Then, it introduces three memory-safe languages and briefly explains their mechanisms against memory bugs.

The Most Common Memory Bugs

Out-of-bounds Write: This is the most dangerous and stubborn software weakness according to the 2023 Common Weakness Enumeration (CWE) Top 25 list. It occurs when an application writes data beyond its intended buffer. The write might land on sensitive data, e.g. data used by other subprograms or the return address of a function. By overwriting a return address, attackers can take control of the program's execution flow and run their injected code.

The root cause of this bug might be issues such as uncontrolled pointer arithmetic or dangling pointers.
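As a rough illustration, the sketch below uses Rust (one of the memory-safe languages discussed later in this post) to show a write that is validated against the buffer length before it happens. The helper name `checked_write` is hypothetical and exists only for this example.

```rust
/// Writes `value` at `index` only if the index lies inside the buffer.
/// `checked_write` is a hypothetical helper used for illustration.
fn checked_write(buffer: &mut [u8], index: usize, value: u8) -> Result<(), String> {
    let len = buffer.len();
    match buffer.get_mut(index) {
        // The index is in range, so the write stays inside the buffer.
        Some(slot) => {
            *slot = value;
            Ok(())
        }
        // The index is out of range: report an error instead of writing.
        None => Err(format!("index {} is outside a buffer of length {}", index, len)),
    }
}

fn main() {
    let mut buffer = [0u8; 4];
    assert!(checked_write(&mut buffer, 2, 7).is_ok());
    // One past the end: rejected, and the buffer is left untouched.
    assert!(checked_write(&mut buffer, 4, 9).is_err());
    println!("{:?}", buffer);
}
```

In a language without such checks, the second call could silently overwrite whatever happens to be stored next to the buffer.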

Use After Free (also known as dereferencing a dangling pointer): A pointer to a memory location that has already been freed is called a dangling pointer, and dereferencing it can have a variety of negative consequences. If the freed memory has already been reused, writing through the dangling pointer might corrupt valid data, and if the reused memory holds control data such as a function return address, the bug can be exploited to execute malicious code. Even merely reading the reused memory can leak sensitive information.
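One way to picture the defence is a handle that can tell whether its target still exists. The sketch below is only an illustration, using Rust's standard `Rc` and `Weak` types. The function name `observe` is an assumption made for this example.

```rust
use std::rc::{Rc, Weak};

/// Returns the length of the string behind `handle`, or `None` if the
/// string has already been freed. `observe` is a hypothetical helper.
fn observe(handle: &Weak<String>) -> Option<usize> {
    // `upgrade` succeeds only while the value is still alive.
    handle.upgrade().map(|text| text.len())
}

fn main() {
    let owner = Rc::new(String::from("secret"));
    let handle = Rc::downgrade(&owner);

    // While the owner exists, the handle can reach the data.
    assert_eq!(observe(&handle), Some(6));

    // Free the data. The handle now reports that nothing is there,
    // instead of exposing whatever reuses the memory.
    drop(owner);
    assert_eq!(observe(&handle), None);
}
```

With ordinary references, Rust rejects this kind of misuse at compile time. The weak handle simply makes the same guarantee visible at run time.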

Out-of-bounds Read: This occurs when an application reads data beyond the intended buffer, resulting in unexpected program behavior or security vulnerabilities. The basic security issue is exposure of sensitive data. More sophisticated threats include the execution of arbitrary code or crashing the application.
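A typical trigger is a length field that claims more data than the buffer actually holds. The following Rust sketch, with the hypothetical helper `read_field`, shows a read that is refused when the claimed length exceeds the buffer.

```rust
/// Returns the first `claimed_len` bytes of `packet`, or `None` if the
/// packet is shorter than claimed. `read_field` is a hypothetical helper.
fn read_field(packet: &[u8], claimed_len: usize) -> Option<&[u8]> {
    // `get` with a range performs the bounds check and never reads past the end.
    packet.get(..claimed_len)
}

fn main() {
    let packet = b"hello";

    // An honest length yields exactly the requested bytes.
    assert_eq!(read_field(packet, 3), Some(&b"hel"[..]));

    // An inflated length is rejected rather than leaking adjacent memory.
    assert!(read_field(packet, 64).is_none());
}
```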

Null Pointer Dereference: This occurs when an application attempts to access data through a pointer that does not point to a valid memory address. The application normally crashes unless exception handling is in place.
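A common countermeasure is to make "no value" an explicit case that the caller must handle before touching the data. The Rust sketch below illustrates the idea with `Option`. The function names are assumptions chosen for this example.

```rust
/// Looks up `wanted` in `names`. The absence of a match is an explicit
/// `None`, not a null pointer. `find_name` is a hypothetical helper.
fn find_name<'a>(names: &[&'a str], wanted: &str) -> Option<&'a str> {
    names.iter().copied().find(|name| *name == wanted)
}

/// Returns the length of the matching name, or 0 when there is no match.
fn name_length(names: &[&str], wanted: &str) -> usize {
    // The compiler requires both cases to be handled before the value is used.
    match find_name(names, wanted) {
        Some(name) => name.len(),
        None => 0,
    }
}

fn main() {
    let names = ["ada", "spark", "rust"];
    assert_eq!(name_length(&names, "spark"), 5);
    assert_eq!(name_length(&names, "c"), 0);
}
```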

Integer Overflow or Wraparound: When an arithmetic operation on an integer produces a value outside the range of its type, the value wraps around to the opposite end of that range (e.g. incrementing the maximum value yields the minimum). It can cause buffer overflow, an infinite loop (if the value is a loop index), or resource (e.g. memory or CPU) exhaustion when the result of the calculation is used for resource management.
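The size calculation for an allocation is a classic place where wraparound turns into a buffer overflow. The Rust sketch below contrasts explicit wrapping arithmetic with checked arithmetic. The function name `allocation_size` is hypothetical.

```rust
/// Computes `count * item_size` in bytes, returning `None` if the result
/// does not fit in a `u32`. `allocation_size` is a hypothetical helper.
fn allocation_size(count: u32, item_size: u32) -> Option<u32> {
    count.checked_mul(item_size)
}

fn main() {
    // Wrapping arithmetic: 65_536 * 65_536 = 2^32, which wraps to 0 in a u32.
    // A buffer sized with this result would be far too small.
    assert_eq!(65_536u32.wrapping_mul(65_536), 0);

    // Checked arithmetic reports the overflow instead of wrapping.
    assert_eq!(allocation_size(65_536, 65_536), None);
    assert_eq!(allocation_size(1_000, 16), Some(16_000));

    // The same applies to addition at the edge of the range.
    assert_eq!(u8::MAX.wrapping_add(1), 0);
    assert_eq!(u8::MAX.checked_add(1), None);
}
```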

Race Condition: This refers to a situation where a resource that should be exclusively accessed is instead modified/read at the same time by concurrent threads. A race condition is due to the lack of synchronisation between threads. Unexpected states of the shared resource, application crash, and resource exhaustion could be the consequences.
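The usual remedy is to guard the shared resource with a synchronisation primitive. The following Rust sketch is one possible arrangement, using the standard `Arc` and `Mutex` types so that several threads can update a counter safely. The function name `parallel_count` is an assumption made for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` threads that each add 1 to a shared counter
/// `increments` times, then returns the final total.
/// `parallel_count` is a hypothetical helper.
fn parallel_count(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::with_capacity(threads);

    for _ in 0..threads {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..increments {
                // The lock gives this thread exclusive access for the update.
                *counter.lock().unwrap() += 1;
            }
        }));
    }

    // Wait for every thread to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    // With the mutex in place, no increment is lost.
    assert_eq!(parallel_count(4, 1_000), 4_000);
}
```

Without the lock, two threads could read the same old value and both write back the same new one, losing an update.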

Memory-Safe Programming Languages

A memory-safe programming language enforces measures to prevent memory misuse instead of relying on developers to add proper checks in their code. These measures range from the most conventional, such as bounds checking, to more sophisticated ones, such as variable ownership.

Ada: Ada is a safe programming language that has been widely used for developing safety-critical applications. Ada enforces memory safety through language constructs that entail extensive run-time and compile-time checks. Strong typing, formal parameter modes, protected objects and safe pointers are among those constructs.

The built-in checks prevent a variety of errors, including buffer overflow and out-of-bounds read/write. Strong typing and parameter modes prevent the injection of wrong data that modifies memory or possibly alters control flow. Protected objects make sure that accessing shared resources is safe and free of race conditions. Pointers, the main cause of many memory bugs in other languages, are made safer in Ada through accessibility rules and null exclusion, which prevents null pointer dereferencing. Furthermore, some features provide functionality similar to pointers without the overhead and potential memory safety issues, so that pointers often do not need to be used at all.

Moreover, Ada's memory safety is backed by dynamic and static analysis tools for detecting errors that are often hard to track down during run time and compilation. See this blog for more information.

SPARK: SPARK is a subset of Ada and is listed as a memory-safe language by NIST. It supports Ada’s applicable memory safety features; a difference is that in SPARK, all checks are statically proven, whereas in Ada many are enforced at run time. SPARK includes a safe pointer facility based on an ownership mechanism.

Rust: Rust supports memory safety through several mechanisms, including immutability of variables, ownership, borrowing, and bounds checking. In Rust, variables are immutable by default, which helps to protect against unintended changes. The Rust compiler manages memory using a system of ownership, which checks that each value in the program has a unique owner, and when the owner goes out of scope, the memory is automatically released. Borrowing allows either any number of shared (read-only) references to a value or exactly one mutable reference at a time; this rule is also enforced in the presence of concurrency (threads), which prevents data races. Compile-time flow analysis checks that every reference is valid when it is dereferenced (i.e., the referenced value still exists). Safe Rust has no null references (and thus no way to dereference a null pointer in safe code). Run-time bounds checking verifies that the code accesses memory within the bounds of arrays and vectors, preventing buffer overflow.
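The ownership and borrowing rules described above can be sketched in a few lines. The function names below are hypothetical and exist only to show a shared borrow, a mutable borrow, and a transfer of ownership.

```rust
/// Shared borrow: reads the values without taking ownership.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Mutable borrow: the caller grants temporary, exclusive write access.
fn append_twice(values: &mut Vec<i32>, item: i32) {
    values.push(item);
    values.push(item);
}

/// Move: the vector is owned by this function and freed when it returns.
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

fn main() {
    // Variables are immutable by default; `mut` opts in to mutation.
    let mut data = vec![1, 2, 3];

    assert_eq!(total(&data), 6);
    append_twice(&mut data, 4);
    assert_eq!(total(&data), 14);

    // Ownership moves into `consume`; `data` cannot be used afterwards.
    assert_eq!(consume(data), 5);
    // data.len(); // compile error if uncommented: value used after move
}
```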

Rust has a mode called “unsafe Rust”, which allows programmers to disable some safety measures for more flexibility. However, unsafe Rust doesn’t disable the borrow checker or bounds checking.
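As a small illustration of that last point, the sketch below dereferences a raw pointer inside an `unsafe` block while an ordinary slice access in the same block remains bounds-checked. The function name `first_and_sixth` is an assumption made for this example.

```rust
/// Reads the first element through a raw pointer and the sixth element
/// through the normal checked API. `first_and_sixth` is a hypothetical helper.
fn first_and_sixth(values: &[u8]) -> (Option<u8>, Option<u8>) {
    if values.is_empty() {
        return (None, None);
    }
    let ptr = values.as_ptr();
    unsafe {
        // SAFETY: the slice is non-empty, so `ptr` points at a valid first
        // element. The compiler trusts the programmer on this point.
        let first = *ptr;
        // Safe operations keep their checks inside `unsafe`:
        // `get` still returns `None` for an out-of-range index.
        let sixth = values.get(5).copied();
        (Some(first), sixth)
    }
}

fn main() {
    assert_eq!(first_and_sixth(&[10, 20, 30]), (Some(10), None));
    assert_eq!(first_and_sixth(&[1, 2, 3, 4, 5, 6]), (Some(1), Some(6)));
}
```

Only the raw pointer dereference requires the `unsafe` block; it is the one step whose validity the programmer, rather than the compiler, must guarantee.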

Conclusion

Over the past decades, memory safety bugs have been a persistent and severe threat to the security and safety of some of our most critical cyber systems. The programming language is a key factor: it can be part of the problem or part of the solution. This blog introduced the most common memory errors and three memory-safe languages that have built-in countermeasures against these errors. Read more about these languages in future blogs.

Posted in #Programming #memory safety #memory bugs #Ada #SPARK #Rust

About Forough Goudarzi

Forough holds a Ph.D. in Electronic and Computer Engineering from Brunel University London. She has prior experience as a research fellow and technical author, as well as a background in the telecommunications industry.