In Vulkan, suballocation of DeviceMemory is left to the application, because every application has slightly different needs and one cannot incorporate an allocator into the driver that would perform well in all cases. Vulkano stays true to this sentiment, but aims to reduce the burden on the user as much as possible. You have a toolbox of suballocators to choose from that cover all allocation algorithms, which you can compose into any kind of hierarchy you wish. This way you have maximum flexibility while still only using a few DeviceMemory blocks and without writing any of the very error-prone code yourself.
If you just want to allocate memory and don’t have any special needs, look no further than the StandardMemoryAllocator.
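For instance, here is a minimal sketch of what that looks like (assuming a vulkano 0.34-style API and a `device: Arc<Device>` that has already been created elsewhere; adjust the names to your version of the crate):

```rust
use std::sync::Arc;

use vulkano::buffer::{Buffer, BufferCreateInfo, BufferUsage, Subbuffer};
use vulkano::device::Device;
use vulkano::memory::allocator::{
    AllocationCreateInfo, MemoryTypeFilter, StandardMemoryAllocator,
};

fn create_vertex_buffer(device: Arc<Device>) -> Subbuffer<[f32]> {
    // One allocator shared by the whole application; it suballocates a few
    // DeviceMemory blocks internally, so creating many buffers stays cheap.
    let memory_allocator = Arc::new(StandardMemoryAllocator::new_default(device));

    Buffer::from_iter(
        memory_allocator,
        BufferCreateInfo {
            usage: BufferUsage::VERTEX_BUFFER,
            ..Default::default()
        },
        AllocationCreateInfo {
            memory_type_filter: MemoryTypeFilter::PREFER_DEVICE
                | MemoryTypeFilter::HOST_SEQUENTIAL_WRITE,
            ..Default::default()
        },
        [0.0f32; 12],
    )
    .unwrap()
}
```

In a real application you would create the allocator once and clone the `Arc` wherever a resource needs to be allocated.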
Why not just allocate DeviceMemory?
But the driver has an allocator! Otherwise you wouldn’t be able to allocate DeviceMemory, right? Indeed, but that allocation is very expensive. Not only that, drivers also impose a pretty low limit on the number of simultaneous DeviceMemory allocations. See, everything in Vulkan tries to keep you away from allocating DeviceMemory too often. These limits are used by the implementation to optimize on its end, while the application optimizes on the other end.
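To put a number on it (a small sketch, assuming you already have a `physical_device` at hand): the limit is a queryable device property, and the specification only guarantees it to be at least 4096, far fewer than the number of buffers and images a typical application creates.

```rust
use std::sync::Arc;

use vulkano::device::physical::PhysicalDevice;

fn print_allocation_limit(physical_device: &Arc<PhysicalDevice>) {
    // maxMemoryAllocationCount: how many DeviceMemory allocations may be
    // alive at the same time on this device.
    let limit = physical_device.properties().max_memory_allocation_count;
    println!("at most {limit} DeviceMemory allocations may exist at once");
}
```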
Alignment
At the end of the day, memory needs to be backed by hardware somehow. A memory cell stores a single bit, bits are grouped into bytes and bytes are grouped into words. Intuitively, it should make sense that accessing single bits at a time would be very inefficient. That is why computers always access a whole word of memory at once, at least. That means that if you tried to do an unaligned access, you would need to access twice the number of memory locations.
Example aligned access, performing bitwise NOT on the (64-bit) word at offset 0x08:
    | 08                      | 10                      | 18
----+-------------------------+-------------------------+----
••• | 35 35 35 35 35 35 35 35 | 01 23 45 67 89 ab cd ef | •••
----+-------------------------+-------------------------+----
    ,            |            ,
    +------------|------------+
    '            v            '
----+-------------------------+-------------------------+----
••• | ca ca ca ca ca ca ca ca | 01 23 45 67 89 ab cd ef | •••
----+-------------------------+-------------------------+----
Same example as above, but this time with an unaligned access to the word at offset 0x0a:
    | 08    0a                | 10                      | 18
----+-------------------------+-------------------------+----
••• | cd ef 35 35 35 35 35 35 | 35 35 01 23 45 67 89 ab | •••
----+-------------------------+-------------------------+----
           ,            |            ,
           +------------|------------+
           '            v            '
----+-------------------------+-------------------------+----
••• | cd ef ca ca ca ca ca ca | ca ca 01 23 45 67 89 ab | •••
----+-------------------------+-------------------------+----
As you can see, in the unaligned case the hardware would need to read both the word at offset 0x08 and the word at offset 0x10 and then shift the bits from one register into the other. Safe to say it should be avoided, and this is why we need alignment.

This example also goes to show how inefficient unaligned writes are. Say you pieced together your word as described, and now you want to perform the bitwise NOT and write the result back. Difficult, isn’t it? That’s because even though the chunks occupy different ranges in memory, they are still said to alias each other: if you try to write to one memory location, you would be overwriting 2 or more different chunks of data.
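In practice this means an allocator must round offsets up to the required alignment before placing an allocation. A minimal sketch of that arithmetic (plain Rust, nothing vulkano-specific), using the offsets from the example above:

```rust
/// Rounds `offset` up to the next multiple of `alignment`.
/// Alignments are always powers of two in Vulkan, which is what makes the
/// bit-twiddling below valid.
fn align_up(offset: u64, alignment: u64) -> u64 {
    debug_assert!(alignment.is_power_of_two());
    (offset + alignment - 1) & !(alignment - 1)
}

fn main() {
    assert_eq!(align_up(0x08, 8), 0x08); // already aligned, no padding needed
    assert_eq!(align_up(0x0a, 8), 0x10); // unaligned, pushed to the next word
}
```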
Pages
It doesn’t stop at the word, though. Words are further grouped into pages. These are typically power-of-two multiples of the word size, much like words are typically powers of two themselves. You can easily extend the concepts from the previous examples to pages if you think of the examples as having a page size of 1 word. Two resources are said to alias if they share a page, and therefore should be aligned to the page size. What the page size is depends on the context, and a computer might have multiple different ones for different parts of hardware.
Memory requirements
A Vulkan device might have any number of reasons it would want certain alignments for certain resources. For example, the device might have different caches for different types of resources, which have different page sizes. Maybe the device wants to store images in some other cache than buffers, one which needs a different alignment. Or maybe images of different layouts require different alignments, or buffers with different usage/mapping do. The specifics don’t matter in the end; this just goes to illustrate the point. This is why memory requirements in Vulkan vary not only with the Vulkan implementation, but also with the type of resource.
Buffer-image granularity
This unfortunately named granularity is the page size which a linear resource neighboring a non-linear resource must be aligned to in order for them not to alias. The difference between the memory requirements of the individual resources and the buffer-image granularity is that the memory requirements only apply to the resource they are for, while the buffer-image granularity applies to two neighboring resources. For example, you might create two buffers, which might have two different memory requirements, but as long as those are satisfied, you can put these buffers cheek to cheek. On the other hand, if one of them is an (optimal layout) image, then they must not share any page, whose size is given by this granularity. The Vulkan implementation can use this for additional optimizations if it needs to, or report a granularity of 1.
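To make that concrete, here is a sketch of the check this rule implies: a linear and a non-linear resource alias, in the sense of this granularity, if the ranges of granularity-sized pages they touch overlap. The helper below is hypothetical and for illustration only, not part of vulkano’s API.

```rust
/// Returns whether two suballocations touch a common page of size `granularity`.
/// Hypothetical helper, for illustration only.
fn share_a_page(a_offset: u64, a_size: u64, b_offset: u64, b_size: u64, granularity: u64) -> bool {
    debug_assert!(granularity.is_power_of_two());
    // First and last page index that each allocation touches.
    let a_first = a_offset / granularity;
    let a_last = (a_offset + a_size - 1) / granularity;
    let b_first = b_offset / granularity;
    let b_last = (b_offset + b_size - 1) / granularity;
    a_first <= b_last && b_first <= a_last
}

fn main() {
    // With a granularity of 1024, a buffer at [0, 256) and an image at
    // [256, 768) would share page 0, so the image has to be pushed to 1024.
    assert!(share_a_page(0, 256, 256, 512, 1024));
    assert!(!share_a_page(0, 256, 1024, 512, 1024));
}
```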
Fragmentation
Memory fragmentation refers to the wastage of memory that results from alignment requirements and/or dynamic memory allocation. As such, some level of fragmentation is always going to be inevitable. Different allocation algorithms each have their own characteristics and trade-offs in relation to fragmentation.
Internal Fragmentation
This type of fragmentation arises from alignment requirements. These might be imposed by the Vulkan implementation or the application itself.
Say for example your allocations need to be aligned to 64B, then any allocation whose size is not a multiple of the alignment will need padding at the end:
    | 0x040            | 0x080            | 0x0c0            | 0x100
----+------------------+------------------+------------------+--------
    | ############     | ################ | ########         | #######
••• | ### 48 B ###     | ##### 64 B ##### | # 32 B #         | ###  •••
    | ############     | ################ | ########         | #######
----+------------------+------------------+------------------+--------
If this alignment is imposed by the Vulkan implementation, then there’s nothing one can do about this. Simply put, that space is unusable. One also shouldn’t want to do anything about it, since these requirements have very good reasons, as described in further detail in previous sections. They prevent resources from aliasing so that performance is optimal.
It might seem strange that the application would want to cause internal fragmentation itself, but this is often a good trade-off to reduce or even completely eliminate external fragmentation. Internal fragmentation is very predictable, which makes it easier to deal with.
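Sticking with the 64 B alignment from the example above, the wasted space is simply the difference between an allocation’s size and that size rounded up to the alignment:

```rust
/// Internal fragmentation caused by padding `size` up to `alignment`.
fn padding(size: u64, alignment: u64) -> u64 {
    size.next_multiple_of(alignment) - size
}

fn main() {
    assert_eq!(padding(48, 64), 16); // the 48 B allocation wastes 16 B
    assert_eq!(padding(64, 64), 0);  // a full 64 B allocation wastes nothing
    assert_eq!(padding(32, 64), 32); // the 32 B allocation wastes 32 B
}
```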
External fragmentation
With external fragmentation, what happens is that while the allocations might be using their own memory totally efficiently, the way they are arranged in relation to each other would prevent a new contiguous chunk of memory from being allocated even though there is enough free space left. That is why this fragmentation is said to be external to the allocations. On top of that, the allocations together with the fragments in-between add overhead to the allocator, both in terms of space and time, because it needs to keep track of more things overall.
As an example, take these 4 allocations within some block, with the rest of the block assumed to be full:
+-----+-------------------+-------+-----------+-- - - --+
|     |                   |       |           |         |
|  A  |         B         |   C   |     D     |   •••   |
|     |                   |       |           |         |
+-----+-------------------+-------+-----------+-- - - --+
The allocations were all done in order, and naturally there is no fragmentation at this point. Now if we free B and D, since these are done out of order, we will be left with holes between the other allocations, and we won’t be able to fit allocation E anywhere:
+-----+-------------------+-------+-----------+-- - - --+        +-------------------------+
|     |                   |       |           |         |    ?   |                         |
|  A  |                   |   C   |           |   •••   |   <==  |            E            |
|     |                   |       |           |         |        |                         |
+-----+-------------------+-------+-----------+-- - - --+        +-------------------------+
So fine, we use a different block for E, and just use this block for allocations that fit:
+-----+---+-----+---------+-------+-----+-----+-- - - --+
|     |   |     |         |       |     |     |         |
|  A  | H |  I  |    J    |   C   |  F  |  G  |   •••   |
|     |   |     |         |       |     |     |         |
+-----+---+-----+---------+-------+-----+-----+-- - - --+
Sure, now let’s free some of them, shall we? And voilà, the problem just became much worse:
+-----+---+-----+---------+-------+-----+-----+-- - - --+
|     |   |     |         |       |     |     |         |
|  A  |   |  I  |    J    |       |  F  |     |   •••   |
|     |   |     |         |       |     |     |         |
+-----+---+-----+---------+-------+-----+-----+-- - - --+
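The same story can be told in code. Below is a toy first-fit free list over a 64-unit block that reproduces the scenario above; it is only an illustration of the effect, not how vulkano’s FreeListAllocator is actually implemented.

```rust
/// Toy first-fit free list. Free ranges are kept as (offset, size), sorted by offset.
struct ToyFreeList {
    holes: Vec<(u64, u64)>,
}

impl ToyFreeList {
    fn new(size: u64) -> Self {
        ToyFreeList { holes: vec![(0, size)] }
    }

    /// First-fit allocation; returns the offset, or `None` if no hole is big enough.
    fn alloc(&mut self, size: u64) -> Option<u64> {
        let i = self.holes.iter().position(|&(_, hole)| hole >= size)?;
        let (offset, hole) = self.holes[i];
        if hole == size {
            self.holes.remove(i);
        } else {
            self.holes[i] = (offset + size, hole - size);
        }
        Some(offset)
    }

    /// Frees a range. For brevity, neighboring holes are not coalesced.
    fn free(&mut self, offset: u64, size: u64) {
        let i = self
            .holes
            .iter()
            .position(|&(o, _)| o > offset)
            .unwrap_or(self.holes.len());
        self.holes.insert(i, (offset, size));
    }
}

fn main() {
    let mut block = ToyFreeList::new(64);
    let _a = block.alloc(8).unwrap(); // A
    let b = block.alloc(24).unwrap(); // B
    let _c = block.alloc(8).unwrap(); // C
    let d = block.alloc(16).unwrap(); // D
    let _rest = block.alloc(8).unwrap(); // the rest of the block

    // Free B and D: 40 units are free in total...
    block.free(b, 24);
    block.free(d, 16);

    // ...yet a 32-unit allocation E does not fit, because no single hole is
    // large enough. That unusable free space is external fragmentation.
    assert!(block.alloc(32).is_none());
}
```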
Leakage
Memory leaks happen when allocations are kept alive past their shelf life. This most often occurs because of cyclic references. If you have structures that have cycles, then make sure you read the documentation for Arc/Rc carefully to avoid memory leaks. You can also introduce memory leaks willingly by using mem::forget or Box::leak to name a few. In all of these examples the memory can never be reclaimed, but that doesn’t have to be the case for something to be considered a leak. Say for example you have a region which you suballocate, and at some point you drop all the suballocations. When that happens, the region can be returned (freed) to the next level up the hierarchy, or it can be reused by another suballocator. But if you happen to keep alive just one suballocation for the duration of the program for instance, then the whole region is also kept as it is for that time (and keep in mind this bubbles up the hierarchy). Therefore, for the program, that memory might be a leak depending on the allocator, because some allocators wouldn’t be able to reuse the entire rest of the region. You must always consider the lifetime of your resources when choosing the appropriate allocator.
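The cyclic-reference case mentioned above, in miniature: two nodes that hold strong Rcs to each other are never dropped, while holding one direction as a Weak breaks the cycle. The same reasoning applies to Arc, and to any handle that transitively keeps a suballocation (and therefore its whole region) alive.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    // Weak: does not keep the parent alive, so no reference cycle forms.
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn main() {
    let parent = Rc::new(Node {
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(child);

    // When `parent` goes out of scope both nodes are dropped. If `Node::parent`
    // were a strong `Rc` instead, the two nodes would keep each other alive
    // and the memory would never be reclaimed.
}
```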
Re-exports
pub use self::suballocator::AllocationType;
pub use self::suballocator::BuddyAllocator;
pub use self::suballocator::BumpAllocator;
pub use self::suballocator::FreeListAllocator;
pub use self::suballocator::Suballocation;
pub use self::suballocator::Suballocator;
pub use self::suballocator::SuballocatorError;
Modules
- Suballocators are used to divide a region into smaller suballocations.
Structs
- Parameters to create a new allocation using a memory allocator.
- An opaque handle identifying an allocation inside an allocator.
- Vulkan analog of std’s Layout, represented using DeviceSizes.
- A generic implementation of a memory allocator.
- Parameters to create a new GenericMemoryAllocator.
- An allocation made using a memory allocator.
- Describes what memory property flags are required, preferred and not preferred when picking a memory type index.
Enums
- Describes whether allocating DeviceMemory is desired.
- Error that can be returned when creating an allocation using a memory allocator.
Traits
- General-purpose memory allocators which allocate from any memory type dynamically as needed.
Type Aliases
- Standard memory allocator intended as a global and general-purpose allocator.