Visualizing memory management in Rust

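This is part of the "memory-management" series
  1. [[Demystifying memory management in modern programming languages]]
  2. [[Visualizing memory management in JVM (Java, Kotlin, Scala, Groovy, Clojure)]]
  3. [[Visualizing memory management in V8 Engine (JavaScript, NodeJS, Deno, WebAssembly)]]
  4. [[Visualizing memory management in Golang]]
  5. [[Visualizing memory management in Rust]]
  6. [[Avoiding Memory Leaks in NodeJS: Best Practices for Performance]]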


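In this multi-part series, I aim to demystify the concepts behind memory management and take a deeper look at memory management in some of the modern programming languages. I hope the series gives you some insight into what is happening under the hood of these languages in terms of memory management.

In this chapter, we will look at the memory management of the Rust programming language. Rust is a statically typed and compiled systems programming language, like C and C++. Rust is memory and thread-safe and does not have a runtime or a garbage collector. I also wrote about my first impressions of Rust earlier.

If you haven't read the first part of this series, please read it first, as I explain the difference between Stack and Heap memory there, which will be useful for understanding this chapter.

This post is based on the official Rust 1.41 implementation, and concept details might change in future versions of Rust.

Compared to the languages we have seen so far in this series, Rust is quite unique. Let's see how.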

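Rust internal memory structure

First, let us see what the internal memory structure of Rust is.

As of now, Rust does not have a defined memory model in the language specification, and the memory structure is quite simple.

Each Rust program process is allocated some virtual memory by the Operating System (OS). This is the total memory that the process has access to.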

![[Pasted image 20220722171552.png]]

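This is quite simple compared to the memory structures we saw in the previous chapters for JVM, V8 and Go. As you can see, there is no generational memory or any complex substructure, since there is no Garbage Collection (GC) involved. The reason is that Rust manages memory as part of program execution at runtime, using the Ownership model rather than any kind of GC.

Let us see what the different memory areas are:

Heap

This is where all dynamic data (any data whose size cannot be calculated at compile time) is stored. This is the biggest block of memory and the part managed by Rust's Ownership model.

  • Box: The Box type is an abstraction for a heap-allocated value in Rust. Heap memory is allocated when Box::new is called. A Box<T> is a smart pointer to the heap memory allocated for a value of type T, and the pointer itself is saved on the Stack.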
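Here is a minimal sketch of that split, assuming a 1 KiB array as the payload (the helper name `boxed_payload` is made up for illustration): the `Box` handle kept on the Stack is only pointer-sized, while the payload it owns lives on the Heap.

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical helper: places a 1 KiB array on the Heap and returns the owning pointer.
fn boxed_payload() -> Box<[u8; 1024]> {
    Box::new([0u8; 1024])
}

fn main() {
    let payload = boxed_payload();

    // The Box handle on the Stack is the size of a single pointer...
    assert_eq!(size_of::<Box<[u8; 1024]>>(), size_of::<usize>());
    // ...while the value it points to occupies 1024 bytes on the Heap.
    assert_eq!(size_of_val(&*payload), 1024);

    println!(
        "handle: {} bytes, payload: {} bytes",
        size_of::<Box<[u8; 1024]>>(),
        size_of_val(&*payload)
    );
} // `payload` goes out of scope here and the Heap allocation is freed
```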

Stack

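This is the Stack memory area, and there is one Stack per thread. This is where static values are allocated by default. Static data (data whose size is known at compile time) includes function frames, primitive values, structs and pointers to dynamic data on the Heap.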


Rust memory usage (Stack vs Heap)

Now that we are clear about how memory is organized, let’s see how Rust uses the Stack and Heap when a program is executed.

Let’s use the Rust program below. The code is not optimized for correctness, so ignore issues like unnecessary intermediate variables; the focus is on visualizing Stack and Heap memory usage.

struct Employee<'a> {
    // The 'a defines the lifetime of the struct. Here it means the reference of `name` field must outlive the `Employee`
    name: &'a str,
    salary: i32,
    sales: i32,
    bonus: i32,
}

const BONUS_PERCENTAGE: i32 = 10;

// salary is borrowed
fn get_bonus_percentage(salary: &i32) -> i32 {
    let percentage = (salary * BONUS_PERCENTAGE) / 100;
    return percentage;
}

// salary is borrowed while no_of_sales is copied
fn find_employee_bonus(salary: &i32, no_of_sales: i32) -> i32 {
    let bonus_percentage = get_bonus_percentage(salary);
    let bonus = bonus_percentage * no_of_sales;
    return bonus;
}

fn main() {
    // variable is declared as mutable
    let mut john = Employee {
        name: &format!("{}", "John"), // explicitly making the value dynamic
        salary: 5000,
        sales: 5,
        bonus: 0,
    };

    // salary is borrowed while sales is copied since i32 is a primitive
    john.bonus = find_employee_bonus(&john.salary, john.sales);
    println!("Bonus for {} is {}", john.name, john.bonus);
}

All values in Rust are allocated on the Stack by default. There are two exceptions to this:

  1. When the size of the value is unknown at compile time, e.g. structs like String and Vec, which can grow in size over time, or any other dynamic value
  2. When you manually create a Box<T> value like Box::new("Hello"). A box is a smart pointer to a heap-allocated value of type T. When a box goes out of scope, its destructor is called, the inner object is destroyed, and the memory on the Heap is freed.

In both cases, the value is allocated on the Heap and its pointer lives on the Stack.
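As a rough sketch of the first case (the helper name `string_layout` and the sample text are my own), a `String` keeps a fixed-size handle on the Stack holding a pointer, a length and a capacity, while the text itself sits in a Heap buffer that can be replaced by a larger one as the value grows:

```rust
use std::mem::size_of;

/// Hypothetical helper: returns the Stack size of a String handle plus its
/// Heap capacity before and after the text grows.
fn string_layout() -> (usize, usize, usize) {
    let mut name = String::with_capacity(4);
    name.push_str("John");
    let before = name.capacity();

    // If the extra text does not fit in the reserved space, the String moves
    // its contents to a larger Heap buffer.
    name.push_str(" Smith");
    let after = name.capacity();

    // The handle on the Stack stays the same size no matter how long the text is.
    (size_of::<String>(), before, after)
}

fn main() {
    let (handle, before, after) = string_layout();
    println!("handle on the Stack: {} bytes", handle);
    println!("Heap capacity: {} -> {} bytes", before, after);
}
```

The exact capacities depend on the allocator and the standard library version, so treat the printed numbers as illustrative.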

Let us visualize how the above program is executed and how the Stack and Heap memory is used.

As you can see:

  • The main function is kept in a “main frame” on the Stack
  • Every function call is added to the Stack memory as a frame-block
  • All static variables, including arguments and the return value, are saved within the function frame-block on the Stack
  • All static values regardless of type are stored directly on the Stack. This applies to global scope as well
  • All dynamic types are created on the Heap and are referenced from the Stack using smart pointers. This applies to the global scope as well. Here we explicitly made the name dynamic to keep it off the Stack, since a fixed-length string value would otherwise end up there
  • The struct with static data is kept on the Stack, and any dynamic value in it is kept on the Heap and is referenced via pointers
  • Functions called from the current function are pushed on top of the Stack
  • When a function returns, its frame is removed from the Stack
  • Unlike Garbage collected languages, once the main process is complete, the objects on the Heap are destroyed as well. We will see more about this in the following sections

As you can see, the Stack is managed automatically, and this is done by the operating system rather than Rust itself, so we do not have to worry much about it. The Heap, on the other hand, is not automatically managed by the OS. Since it is the biggest memory space and holds dynamic data, it could grow without bound, causing our program to run out of memory over time. It also becomes fragmented over time, slowing down applications. This is where Rust’s ownership model steps in to automatically manage the Heap memory.

Note: you can find the code I used to identify where a value ends up here


Rust Memory management: Ownership

Rust has one of the most unique ways of managing Heap memory, and that is what makes Rust special. It uses a concept called ownership to manage memory. It is defined by a set of rules:

  1. Every value in Rust must have a variable as its owner
  2. There must be only one owner for a value at any given time
  3. When the owner goes out of scope, the value will be dropped, freeing the memory

The rules apply regardless of whether the value is in Stack or Heap memory. For example, in the example below the value of foo is dropped as soon as the method execution completes, and the value of bar is dropped right after the block execution.

fn main() {
    let foo = "value"; // owner is foo and is valid within this method
    // bar is not valid here as it's not declared yet

    {
        let bar = "bar value"; // owner is bar and is valid within this block scope
        println!("value of bar is {}", bar); // bar is valid here
        println!("value of foo is {}", foo); // foo is valid here
    }

    println!("value of foo is {}", foo); // foo is valid here
    // The next line does not compile: bar is not valid here as it's out of scope
    // println!("value of bar is {}", bar);
}

These rules are checked by the compiler at compile time, and the freeing of memory happens at runtime as part of program execution, so there is no additional overhead or pause time. By scoping variables carefully, we can make sure memory usage is optimized, which is also why Rust lets you use block scope almost everywhere. This might sound simple, but in practice this concept has deep implications for how you write Rust programs, and it takes some getting used to. The Rust compiler does a great job of helping you along the way as well.

Because of the strict ownership rules, Rust lets you transfer ownership from one variable to another; this is called a move. A move happens automatically when passing a variable into a function or when assigning it to another variable. For static primitives (types that implement the Copy trait, such as i32), a copy is used instead of a move.
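The difference between a move and a copy can be sketched like this. The helper functions `consume` and `double` are made up for the illustration, and the commented-out lines are the ones the compiler would reject:

```rust
/// Hypothetical helper: takes ownership of the String, so the caller's variable is moved.
fn consume(name: String) -> usize {
    name.len()
} // `name` is dropped here and its Heap buffer is freed

/// Hypothetical helper: i32 implements `Copy`, so the caller keeps its own value.
fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    let name = String::from("John");
    let moved = name; // ownership moves from `name` to `moved`
    // println!("{}", name); // would not compile: `name` was moved

    let len = consume(moved); // ownership moves into the function
    // println!("{}", moved); // would not compile: `moved` was moved

    let sales = 5;
    let doubled = double(sales); // `sales` is copied, not moved
    println!("len = {}, sales = {}, doubled = {}", len, sales, doubled);

    let original = String::from("Jane");
    let copy = original.clone(); // explicit deep copy; both variables stay valid
    println!("{} {}", original, copy);
}
```

When you need two independent owners of Heap data, `clone` makes the cost of the extra allocation explicit.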

There are a few more concepts related to memory management that work together with Ownership to make it effective:

RAII

RAII stands for Resource Acquisition Is Initialization. This is not new in Rust; it is borrowed from C++. Rust enforces RAII so that when a value is initialized the variable owns the resources associated with it, and its destructor is called when the variable goes out of scope, freeing the resources. This ensures that we never have to manually free memory, and it rules out most memory leaks. Here is an example:

fn create_box(i: u32) {
    // Allocate a string on the heap
    let _var_i = Box::new(format!("Hello {}", i));
    // `_var_i` is destroyed here, and memory gets freed
}

fn main() {
    // Allocate an integer on the heap
    let _var_1 = Box::new(5u32);
    // A nested scope:
    {
        // Allocate a string on the heap
        let _var_2 = Box::new("Hello 2");
        // `_var_2` is destroyed here, and memory gets freed
    }

    // Creating lots of boxes
    // There's no need to manually free memory!
    for i in 0u32..1_000 {
        create_box(i);
    }
    // `_var_1` is destroyed here, and memory gets freed
}
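To make the destructor calls visible, here is a small sketch using the standard `Drop` trait. The `Resource` type and its shared log are assumptions made purely for this illustration; each value records its name when its destructor runs.

```rust
use std::cell::RefCell;

/// Hypothetical resource that records its name in a shared log when it is dropped.
struct Resource<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Resource<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Creates resources in nested scopes and returns the order in which they were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Resource { name: "outer", log: &log };
        {
            let _inner = Resource { name: "inner", log: &log };
        } // `_inner` goes out of scope first
        let _late = Resource { name: "late", log: &log };
    } // `_late` is dropped before `_outer`: reverse order of declaration
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // prints ["inner", "late", "outer"]
}
```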

Borrowing & Borrow checker

In Rust we can pass a variable either by value or by reference; passing a variable by reference is called borrowing. Since we can have only one owner for a resource at a time, we have to borrow a resource to use it without taking ownership of it. The Rust compiler has a borrow checker that statically ensures that references point to valid objects and that ownership rules are not violated. Here is a simplified version of the official Rust example.

// This function takes ownership of the passed value
fn take_ownership(value: Box<i32>) {
    println!("Destroying box that contains {}", value);
}

// This function borrows the value by reference
fn borrow(reference: &i32) {
    println!("This is: {}", reference);
}

fn main() {
    // Create a boxed and a stacked variable
    let boxed = Box::new(5_i32);
    let stacked = 6_i32;

    // Borrow the contents of the box. Ownership is not taken,
    // so the contents can be borrowed again.
    borrow(&boxed);
    borrow(&stacked);

    {
        // Take a reference to the data contained inside the box
        let _ref_to_boxed: &i32 = &boxed;

        // Error! Uncommenting the next line fails to compile:
        // `boxed` can't be moved (and destroyed) while `_ref_to_boxed` is still used below.
        // take_ownership(boxed);

        // Use `_ref_to_boxed`; this would be invalid if the inner value had been destroyed above
        borrow(_ref_to_boxed);
        // `_ref_to_boxed` goes out of scope and is no longer borrowed.
    }

    // `boxed` can now give up ownership to `take_ownership` method and be destroyed
    take_ownership(boxed);
}

Variable Lifetimes

The lifetime of variables is another very important concept that makes the ownership model work. It is a construct used by the borrow checker to ensure that all references to an object are valid. This is checked at compile time. The lifetime of a variable begins when it is initialized and ends when it is destroyed. Lifetime is not the same as scope.

This might sound straightforward, but lifetimes get much more complex once functions and structs with references come into play. Once they do, we need to start using lifetime annotations to let the borrow checker know how long references are valid. Sometimes the compiler can infer lifetimes, but not always. I’m not going into details here, as that is outside the scope of this article.
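As a small taste of what a lifetime annotation looks like, here is a sketch of a function returning one of two borrowed string slices. The function name `longest` and the sample values are my own choices for the example:

```rust
/// Returns whichever string slice is longer. The annotation `'a` tells the borrow
/// checker that the result is valid only as long as both inputs are.
fn longest<'a>(left: &'a str, right: &'a str) -> &'a str {
    if left.len() >= right.len() {
        left
    } else {
        right
    }
}

fn main() {
    let first = String::from("ownership");
    let result;
    {
        let second = String::from("borrow");
        result = longest(&first, &second);
        println!("longest = {}", result); // fine: both inputs are still alive
    }
    // println!("{}", result); // would not compile: `second` does not live long enough
}
```

Without the `'a` annotation the compiler cannot tell which input the returned reference comes from, so it rejects the function signature.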

Smart pointers

Pointers are nothing but references to a memory address. Rust has support for pointers and lets us reference and dereference them using the & and * operators. Smart pointers in Rust are like pointers but with additional metadata and capabilities. Like RAII, this is another concept taken from C++. Unlike plain references, which only borrow data, smart pointers own the data they point to. Box, String and Vec are examples of smart pointers in Rust. You can also write your own smart pointers using structs.
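A custom smart pointer could look roughly like the sketch below. The `Tracked` type and its `label` field are hypothetical and exist only to show the two standard traits usually involved, `Deref` and `Drop`:

```rust
use std::ops::Deref;

/// Hypothetical smart pointer: owns a Heap-allocated value and carries a label
/// as extra metadata.
struct Tracked<T> {
    value: Box<T>,
    label: &'static str,
}

impl<T> Tracked<T> {
    fn new(label: &'static str, value: T) -> Self {
        Tracked { value: Box::new(value), label }
    }
}

// `Deref` lets `*tracked` and method calls reach the inner value.
impl<T> Deref for Tracked<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &*self.value
    }
}

// `Drop` runs when the owner goes out of scope; the inner Box is freed right after.
impl<T> Drop for Tracked<T> {
    fn drop(&mut self) {
        println!("dropping {}", self.label);
    }
}

fn main() {
    let name = Tracked::new("name", String::from("John"));
    // Deref coercion: `name.len()` calls `String::len` on the owned value.
    println!("{} has {} characters", *name, name.len());
} // `name` goes out of scope here, so "dropping name" is printed last
```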

Ownership visualization

Now that we have seen the different concepts used for Ownership, let us visualize it. Unlike other languages, where we visualized the data in the Heap, here it is much easier if we look at the code itself.

Lifetime of variables


Conclusion

This post should give you an overview of the Rust memory structure and memory management. This is not exhaustive; there are a lot more advanced concepts, and the implementation details keep changing from version to version. Unlike Garbage collected languages, where you need not understand the memory management model to use the language, in Rust you need to understand how Ownership works in order to write programs. This post is just a starting point; I recommend that you dive into the Rust documentation to learn more about these concepts.

I hope you had fun learning this, stay tuned for the next post in the series.



Cover image inspired by https://hacks.mozilla.org/2019/01/fearless-security-memory-safety/




Copyright © 2022 Deepu K Sasidharan
Content licensed under a Creative Commons Attribution 4.0 International License.
