Module vulkano::memory::allocator

In Vulkan, suballocation of DeviceMemory is left to the application, because every application has slightly different needs and one cannot incorporate an allocator into the driver that would perform well in all cases. Vulkano stays true to this sentiment, but aims to reduce the burden on the user as much as possible. You have a toolbox of suballocators to choose from that cover all allocation algorithms, which you can compose into any kind of hierarchy you wish. This way you have maximum flexibility while still only using a few DeviceMemory blocks and without writing any of the very error-prone code yourself.

If you just want to allocate memory and don’t have any special needs, look no further than the StandardMemoryAllocator.
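For instance, here is a minimal sketch of that path, assuming a recent vulkano release (around 0.34; the exact fields of AllocationCreateInfo differ between versions) and an already-created Device:

use std::sync::Arc;

use vulkano::buffer::{Buffer, BufferCreateInfo, BufferUsage};
use vulkano::device::Device;
use vulkano::memory::allocator::{
    AllocationCreateInfo, MemoryTypeFilter, StandardMemoryAllocator,
};

fn create_example_buffer(device: Arc<Device>) {
    // One general-purpose allocator for the whole application; it suballocates
    // DeviceMemory blocks behind the scenes.
    let memory_allocator = Arc::new(StandardMemoryAllocator::new_default(device));

    // Every buffer/image constructor that takes an allocator goes through it,
    // so each resource does not cost its own DeviceMemory allocation.
    let _buffer = Buffer::from_data(
        memory_allocator.clone(),
        BufferCreateInfo {
            usage: BufferUsage::UNIFORM_BUFFER,
            ..Default::default()
        },
        AllocationCreateInfo {
            memory_type_filter: MemoryTypeFilter::PREFER_DEVICE
                | MemoryTypeFilter::HOST_SEQUENTIAL_WRITE,
            ..Default::default()
        },
        [0.0f32; 16], // some host data to upload
    )
    .expect("failed to create buffer");
}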

Why not just allocate DeviceMemory?

But the driver has an allocator! Otherwise you wouldn’t be able to allocate DeviceMemory, right? Indeed, but that allocation is very expensive. Not only that, there is also a pretty low limit, imposed by the driver, on the number of allocations. See, everything in Vulkan tries to keep you away from allocating DeviceMemory too often. These limits are used by the implementation to optimize on its end, while the application optimizes on the other end.
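As a point of reference, you can query that limit yourself. The sketch below assumes a recent vulkano release, where the device property is exposed as max_memory_allocation_count (mirroring Vulkan’s maxMemoryAllocationCount, for which the spec only guarantees 4096):

use std::sync::Arc;

use vulkano::device::physical::PhysicalDevice;

fn print_allocation_limit(physical_device: &Arc<PhysicalDevice>) {
    // The driver-imposed ceiling on simultaneous DeviceMemory allocations.
    let limit = physical_device.properties().max_memory_allocation_count;
    println!("maxMemoryAllocationCount = {limit}");
}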

Alignment

At the end of the day, memory needs to be backed by hardware somehow. A memory cell stores a single bit, bits are grouped into bytes, and bytes are grouped into words. Intuitively, it should make sense that accessing single bits at a time would be very inefficient. That is why computers always access at least a whole word of memory at once. That means that if you tried to do an unaligned access, you would need to access twice the number of memory locations.

Example aligned access, performing bitwise NOT on the (64-bit) word at offset 0x08:

    | 08                      | 10                      | 18
----+-------------------------+-------------------------+----
••• | 35 35 35 35 35 35 35 35 | 01 23 45 67 89 ab cd ef | •••
----+-------------------------+-------------------------+----
    ,            |            ,
    +------------|------------+
    '            v            '
----+-------------------------+-------------------------+----
••• | ca ca ca ca ca ca ca ca | 01 23 45 67 89 ab cd ef | •••
----+-------------------------+-------------------------+----

Same example as above, but this time unaligned with a word at offset 0x0a:

    | 08    0a                | 10                      | 18
----+-------------------------+-------------------------+----
••• | cd ef 35 35 35 35 35 35 | 35 35 01 23 45 67 89 ab | •••
----+-------------------------+-------------------------+----
           ,            |            ,
           +------------|------------+
           '            v            '
----+-------------------------+-------------------------+----
••• | cd ef ca ca ca ca ca ca | ca ca 01 23 45 67 89 ab | •••
----+-------------------------+-------------------------+----

As you can see, in the unaligned case the hardware would need to read both the word at offset 0x08 and the word at offset 0x10 and then shift the bits from one register into the other. Safe to say it should be avoided, and this is why we need alignment. This example also goes to show how inefficient unaligned writes are. Say you pieced together your word as described, and now you want to perform the bitwise NOT and write the result back. Difficult, isn’t it? That’s due to the fact that even though the chunks occupy different ranges in memory, they are still said to alias each other, because if you try to write to one memory location, you would be overwriting 2 or more different chunks of data.

Pages

It doesn’t stop at the word, though. Words are further grouped into pages. These are typically power-of-two multiples of the word size, much like words are typically powers of two themselves. You can easily extend the concepts from the previous examples to pages if you think of the examples as having a page size of 1 word. Two resources are said to alias if they share a page, and therefore should be aligned to the page size. What the page size is depends on the context, and a computer might have multiple different ones for different parts of hardware.

Memory requirements

A Vulkan device might have any number of reasons it would want certain alignments for certain resources. For example, the device might have different caches for different types of resources, which have different page sizes. Maybe the device wants to store images in a different cache than buffers, which needs a different alignment. Or maybe images of different layouts require different alignment, or buffers with different usage/mapping do. The specifics don’t matter in the end; this just goes to illustrate the point. This is why memory requirements in Vulkan vary not only with the Vulkan implementation, but also with the type of resource.
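You can see what the implementation asks for by creating a raw resource and reading back its requirements. A rough sketch, assuming the RawBuffer type and memory_requirements accessor of recent vulkano releases:

use std::sync::Arc;

use vulkano::buffer::{sys::RawBuffer, BufferCreateInfo, BufferUsage};
use vulkano::device::Device;

fn print_buffer_requirements(device: Arc<Device>) {
    let raw_buffer = RawBuffer::new(
        device,
        BufferCreateInfo {
            size: 1024,
            usage: BufferUsage::STORAGE_BUFFER,
            ..Default::default()
        },
    )
    .expect("failed to create raw buffer");

    // Size, alignment and the memory types this buffer may be bound to,
    // as reported by the implementation.
    println!("{:#?}", raw_buffer.memory_requirements());
}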

Buffer-image granularity

This unfortunately named granularity is the page size which a linear resource neighboring a non-linear resource must be aligned to in order for them not to alias. The difference between the memory requirements of the individual resources and the buffer-image granularity is that the memory requirements only apply to the resource they are for, while the buffer-image granularity applies to two neighboring resources. For example, you might create two buffers, which might have two different memory requirements, but as long as those are satisfied, you can put these buffers cheek to cheek. On the other hand, if one of them is an (optimal layout) image, then they must not share any page, whose size is given by this granularity. The Vulkan implementation can use this for additional optimizations if it needs to, or report a granularity of 1.
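In other words, the check is done at the granularity of these pages. A small self-contained sketch of the arithmetic (this is an illustration, not vulkano’s internal code):

// Hypothetical helper: do two neighboring resources touch the same
// buffer-image-granularity page?
fn shares_page(end_of_first: u64, start_of_second: u64, granularity: u64) -> bool {
    // Compare the page index of the last byte of the first resource with the
    // page index of the first byte of the second resource.
    (end_of_first - 1) / granularity == start_of_second / granularity
}

fn main() {
    let granularity = 1024;
    // A buffer ending at byte 1000 and an image starting at byte 1024 live on
    // different pages, so they don't alias...
    assert!(!shares_page(1000, 1024, granularity));
    // ...but an image starting at byte 1000 would share the first page.
    assert!(shares_page(1000, 1000, granularity));
}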

Fragmentation

Memory fragmentation refers to the wastage of memory that results from alignment requirements and/or dynamic memory allocation. As such, some level of fragmentation is always going to be inevitable. Different allocation algorithms each have their own characteristics and trade-offs in relation to fragmentation.

Internal fragmentation

This type of fragmentation arises from alignment requirements. These might be imposed by the Vulkan implementation or the application itself.

Say for example your allocations need to be aligned to 64 B; then any allocation whose size is not a multiple of the alignment will need padding at the end:

    | 0x040            | 0x080            | 0x0c0            | 0x100
----+------------------+------------------+------------------+--------
    | ############     | ################ | ########         | #######
••• | ### 48 B ###     | ##### 64 B ##### | # 32 B #         | ### •••
    | ############     | ################ | ########         | #######
----+------------------+------------------+------------------+--------

If this alignment is imposed by the Vulkan implementation, then there’s nothing one can do about this. Simply put, that space is unusable. One shouldn’t want to do anything about it either, since these requirements exist for very good reasons, as described in further detail in the previous sections. They prevent resources from aliasing so that performance is optimal.

It might seem strange that the application would want to cause internal fragmentation itself, but this is often a good trade-off to reduce or even completely eliminate external fragmentation. Internal fragmentation is very predictable, which makes it easier to deal with.
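The amount of padding is easy to compute ahead of time, which is what makes this kind of fragmentation predictable. A self-contained sketch of the usual rounding (again just an illustration, not vulkano’s API):

// Round `size` up to the next multiple of a power-of-two `alignment`.
fn align_up(size: u64, alignment: u64) -> u64 {
    debug_assert!(alignment.is_power_of_two());
    (size + alignment - 1) & !(alignment - 1)
}

fn main() {
    // With a 64 B alignment, the 48 B and 32 B allocations from the figure
    // above each occupy a full 64 B slot: 16 B and 32 B of padding.
    assert_eq!(align_up(48, 64), 64);
    assert_eq!(align_up(64, 64), 64);
    assert_eq!(align_up(32, 64), 64);
}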

External fragmentation

With external fragmentation, what happens is that while the allocations might be using their own memory totally efficiently, the way they are arranged in relation to each other would prevent a new contiguous chunk of memory from being allocated even though there is enough free space left. That is why this fragmentation is said to be external to the allocations. Also, the allocations together with the fragments in between add overhead, both in terms of space and time, to the allocator, because it needs to keep track of more things overall.

As an example, take these 4 allocations within some block, with the rest of the block assumed to be full:

+-----+-------------------+-------+-----------+-- - - --+
|     |                   |       |           |         |
|  A  |         B         |   C   |     D     |   •••   |
|     |                   |       |           |         |
+-----+-------------------+-------+-----------+-- - - --+

The allocations were all done in order, and naturally there is no fragmentation at this point. Now if we free B and D, since these are done out of order, we will be left with holes between the other allocations, and we won’t be able to fit allocation E anywhere:

+-----+-------------------+-------+-----------+-- - - --+       +-------------------------+
|     |                   |       |           |         |   ?   |                         |
|  A  |                   |   C   |           |   •••   |  <==  |            E            |
|     |                   |       |           |         |       |                         |
+-----+-------------------+-------+-----------+-- - - --+       +-------------------------+

So fine, we use a different block for E, and just use this block for allocations that fit:

+-----+---+-----+---------+-------+-----+-----+-- - - --+
|     |   |     |         |       |     |     |         |
|  A  | H |  I  |    J    |   C   |  F  |  G  |   •••   |
|     |   |     |         |       |     |     |         |
+-----+---+-----+---------+-------+-----+-----+-- - - --+

Sure, now let’s free some, shall we? And voilà, the problem just became much worse:

+-----+---+-----+---------+-------+-----+-----+-- - - --+
|     |   |     |         |       |     |     |         |
|  A  |   |  I  |    J    |       |  F  |     |   •••   |
|     |   |     |         |       |     |     |         |
+-----+---+-----+---------+-------+-----+-----+-- - - --+

Leakage

Memory leaks happen when allocations are kept alive past their shelf life. This most often occurs because of cyclic references. If you have structures that have cycles, then make sure you read the documentation for Arc/Rc carefully to avoid memory leaks. You can also introduce memory leaks willingly by using mem::forget or Box::leak, to name a few. In all of these examples the memory can never be reclaimed, but that doesn’t have to be the case for something to be considered a leak.

Say for example you have a region which you suballocate, and at some point you drop all the suballocations. When that happens, the region can be returned (freed) to the next level up the hierarchy, or it can be reused by another suballocator. But if you happen to keep alive just one suballocation for the duration of the program, for instance, then the whole region is also kept as it is for that time (and keep in mind this bubbles up the hierarchy). Therefore, for the program, that memory might be a leak depending on the allocator, because some allocators wouldn’t be able to reuse the entire rest of the region. You must always consider the lifetime of your resources when choosing the appropriate allocator.
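To make the cyclic-reference case concrete, here is a plain Rust sketch (independent of vulkano) of a parent/child pair, where a Weak back-reference is what prevents the two from keeping each other alive:

use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    // A strong `Rc` here would form a cycle with `child` below and leak;
    // a `Weak` back-reference breaks the cycle.
    parent: RefCell<Weak<Node>>,
    child: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let parent = Rc::new(Node {
        parent: RefCell::new(Weak::new()),
        child: RefCell::new(None),
    });
    let child = Rc::new(Node {
        parent: RefCell::new(Rc::downgrade(&parent)),
        child: RefCell::new(None),
    });
    *parent.child.borrow_mut() = Some(child);

    // When `parent` goes out of scope, both nodes are dropped, because the
    // child only holds a weak reference back to its parent. Had it held an
    // `Rc`, the two nodes would keep each other alive forever: a leak.
}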

Re-exports

Modules

  • suballocator: Suballocators are used to divide a region into smaller suballocations.

Structs

Enums

Traits

  • MemoryAllocator: General-purpose memory allocators which allocate from any memory type dynamically as needed.

Type Aliases