Memory fragmentation is a common problem in computer systems, though many clever algorithms have emerged to tackle it. Fragmentation wastes free memory blocks scattered across a memory region: these free blocks cannot be merged to serve future requests for large memory blocks, nor can they be returned to the operating system for other uses [1]. The result can look like a memory leak, since the total memory needed to fulfill more and more requests for large blocks would grow indefinitely. This kind of unbounded growth is usually not considered a memory leak in the common sense, because the unused memory blocks are indeed released and marked free; they just cannot be reused (for larger memory block requests) or returned to the operating system.

The ngx_lua module’s lua_shared_dict zones support inserting user data items of arbitrary length, from a tiny single number to a huge string. If care is not taken, this can easily lead to severe memory fragmentation inside the shm zone and waste a lot of free memory. This article presents a few small, standalone examples to demonstrate the problem and its detailed behaviors. It uses the OpenResty XRay dynamic tracing platform to observe the memory fragmentation directly through visualizations along the way. We conclude the discussion by introducing best practices for mitigating memory fragmentation inside shared memory zones.

As with almost all the technical articles on this blog site, we use our OpenResty XRay dynamic tracing product to analyze and visualize the internals of unmodified OpenResty or Nginx servers and applications. Because OpenResty XRay is a noninvasive analysis platform, we don’t need to change anything in the target OpenResty or Nginx processes: no code injection is needed, and no special plugins or modules need to be loaded into the target processes. This ensures that what we see inside the target processes through OpenResty XRay analyzers is exactly what happens when there are no observers at all.

If you are not already familiar with the memory allocation and usage inside OpenResty or Nginx’s shared memory zones, you are encouraged to refer to our previous blog post, “How OpenResty and Nginx Shared Memory Zones Consume RAM”.

An empty zone

Let’s start with an empty shared memory zone with no user data at all and check the slabs, or memory blocks, inside it so that we can understand the “baseline”:

master_process on;
worker_processes 1;

events {
    worker_connections 1024;
}

http {
    lua_shared_dict cats 40k;

    server {
        listen 8080;
    }
}

Here we allocate a shared memory zone named cats with a total size of 40 KB. We never touch this cats zone in our configuration, so it should be empty. But from our previous blog post “How OpenResty and Nginx Shared Memory Zones Consume RAM” we already know that an empty zone still has 3 pre-allocated slabs that serve as the metadata for the slab allocator itself. The following graph of the slab layout in the virtual memory space indeed confirms it:

Slabs layout for empty shm zone cats

As always, this graph is generated automatically by OpenResty XRay by analyzing the real process. We can see there are 3 used slabs and more than a hundred free slabs.
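A coarse, page-level view of the same baseline can also be read from Lua itself. The following is only a sketch: it assumes an ngx_lua version that provides the `capacity()` and `free_space()` methods on shared dictionaries (both require the `resty.core` library to be loaded), and `zone_report` is a hypothetical helper name. Keep in mind that `free_space()` only counts completely free pages, so it says nothing about free slots inside partially used pages.

```lua
-- Hypothetical helper: summarize a shared dict at page granularity.
-- `dict` is expected to be an ngx.shared.DICT object such as ngx.shared.cats.
local function zone_report(dict)
    local total = dict:capacity()        -- configured zone size in bytes
    local free = dict:free_space()       -- bytes in completely free pages only
    return {
        capacity = total,
        free_page_bytes = free,
        -- everything else: allocator metadata plus used or partially used pages
        not_in_free_pages = total - free,
    }
end
```

Inside a Lua handler this could be called as `zone_report(ngx.shared.cats)` and the fields printed with `ngx.say`. It does not replace the per-slab graphs, which show where the free blocks actually are.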

Filling entries of similar sizes

Let’s add the following location = /t to our ongoing example nginx.conf:

location = /t {
    content_by_lua_block {
        local cats = ngx.shared.cats
        local i = 0
        while true do
            local ok, err = cats:safe_set(i, i)
            if not ok then
                if err == "no memory" then
                    break
                end
                ngx.say("failed to set: ", err)
                return
            end
            i = i + 1
        end
        ngx.say("inserted ", i, " keys.")
    }
}

Here we define a Lua content handler in our location = /t which inserts small key-value pairs into the cats zone until no free memory is left. Because we insert numbers as both the keys and the values, and the cats zone is small, the inserted key-value pairs should be of very similar sizes. After starting this Nginx server, we query the /t location like this:

$ curl 'localhost:8080/t'
inserted 255 keys.

We can see that we can insert up to 255 such keys into the zone.

We can check the slabs’ layout inside that shm zone again:

Slab layout in the full shm zone 'cats'

If we compare this graph with the previous graph for the empty zone case, we can see that all the newly added bigger slabs are in red (or in use). Interestingly, the free slabs in the middle (in green) cannot be reused for the bigger slabs even though they are adjacent to each other. Apparently, these preserved free slabs are not automatically merged to form bigger free slabs.

Let’s see the size distribution for all these slabs via OpenResty XRay:

Slab size distribution for the full shm zone 'cats'

We can see that almost all the used slabs are of the 128 byte size.
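A rough back-of-the-envelope calculation can make the 128 byte figure plausible. The sketch below assumes that small requests are rounded up to power-of-two size classes with an 8 byte minimum, and that each entry carries a fixed bookkeeping overhead of 68 bytes on a 64-bit build. Both figures are assumptions made for illustration rather than numbers taken from this article, and the helper names are hypothetical.

```lua
-- Round a request up to its slab size class, assuming power-of-two
-- classes with an 8-byte minimum (valid for requests up to half a page).
local function slab_class(size)
    local class = 8
    while class < size do
        class = class * 2
    end
    return class
end

-- ASSUMPTION: fixed per-entry bookkeeping overhead in bytes (64-bit build).
local ENTRY_OVERHEAD = 68

-- Estimate the slab size of one entry. Number keys are assumed to be
-- stored as strings and number values as 8-byte doubles.
local function estimate_slab(key, value)
    local key_len = #tostring(key)
    local value_len = type(value) == "number" and 8 or #tostring(value)
    return slab_class(ENTRY_OVERHEAD + key_len + value_len)
end

print(estimate_slab(0, 0))       -- 128
print(estimate_slab(254, 254))   -- 128
```

Under these assumptions every entry from key 0 to key 254 needs between 77 and 79 bytes, which lands in the 128 byte class.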

Deleting odd-numbered keys

Now let’s try deleting the odd-numbered keys in the shm zone by adding the following Lua snippet right after our existing Lua content handler code:

for j = 1, i, 2 do
    cats:delete(j)
end

After restarting the server and querying /t again, we get the following new slabs’ layout graph for the cats shm zone:

Slab layout in odd-key-deleted shm zone 'cats'

Now we have a lot of non-adjacent free blocks which cannot be merged together to serve bigger memory block requests in the future. Let’s try adding the following Lua code to the end of the Lua content handler to attempt adding a much bigger entry:

local ok, err = cats:safe_add("Jimmy", string.rep("a", 200))
if not ok then
    ngx.say("failed to add a big entry: ", err)
end

Then we restart the server and query /t:

$ curl 'localhost:8080/t'
inserted 255 keys.
failed to add a big entry: no memory

As expected, the insertion fails. The new big entry has a 200 byte string value, so the corresponding slab must be larger than the largest free slab available in the shm zone (which is 128 bytes, as we saw earlier). It is therefore impossible to fulfill this memory block request without forcibly evicting used entries (as the set() method would do when the zone runs out of memory).
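The difference between the two families of methods can be sketched as follows. This is a minimal illustration, assuming the documented return values of `safe_set()` (`ok, err`) and `set()` (`ok, err, forcible`); `store_entry` is a hypothetical helper, not part of ngx_lua.

```lua
-- Hypothetical helper: store an entry, evicting old entries only as a
-- last resort. Returns "stored", "evicted", or nil plus an error string.
local function store_entry(dict, key, value)
    local ok, err = dict:safe_set(key, value)
    if ok then
        return "stored"
    end
    if err ~= "no memory" then
        return nil, err
    end

    -- set() may forcibly remove least recently used entries to make room.
    local forcible
    ok, err, forcible = dict:set(key, value)
    if not ok then
        return nil, err
    end
    return forcible and "evicted" or "stored"
end
```

Note that even `set()` can still return "no memory" when the entries it evicts do not free enough contiguous space, so eviction should not be treated as a cure for fragmentation.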

Deleting the keys in the first half

Now let’s try something different. Instead of deleting the odd-numbered keys as in the previous section, we delete the keys in the first half of the shm zone by adding the following Lua code:

for j = 0, i / 2 do
    assert(cats:delete(j))
end

After restarting the server and querying /t, we get the following slabs’ layout in the virtual memory:

Slabs' layout for a zone with first half of the keys deleted

We can see now we have the adjacent free slabs automatically merged into 3 big free slabs near the middle of this shm zone. Actually they are 3 free memory pages of 4096 bytes each:

Free slab size distribution

These free pages can further form even larger slabs spanning multiple pages.
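The contrast between the two deletion patterns can be reproduced with a toy model. The sketch below is not the real allocator: it assumes a hypothetical zone of 4 pages, each split into 32 slots of 128 bytes, and treats a page as reusable for other slab sizes only when every slot in it is free.

```lua
-- Toy model: PAGES pages of 4096 bytes, each holding 32 slots of 128 bytes.
local PAGES, SLOTS_PER_PAGE = 4, 32
local TOTAL = PAGES * SLOTS_PER_PAGE

-- Build a zone in which every slot is occupied.
local function new_zone()
    local used = {}
    for slot = 0, TOTAL - 1 do
        used[slot] = true
    end
    return used
end

-- Return the number of free slots and the number of completely free pages.
local function summarize(used)
    local free_slots, free_pages = 0, 0
    for page = 0, PAGES - 1 do
        local free_in_page = 0
        for i = 0, SLOTS_PER_PAGE - 1 do
            if not used[page * SLOTS_PER_PAGE + i] then
                free_in_page = free_in_page + 1
            end
        end
        free_slots = free_slots + free_in_page
        if free_in_page == SLOTS_PER_PAGE then
            free_pages = free_pages + 1
        end
    end
    return free_slots, free_pages
end

-- Pattern 1: free every odd-numbered slot.
local odd = new_zone()
for slot = 1, TOTAL - 1, 2 do
    odd[slot] = false
end

-- Pattern 2: free the first half of the slots.
local half = new_zone()
for slot = 0, math.floor(TOTAL / 2) - 1 do
    half[slot] = false
end

print(summarize(odd))    -- 64 free slots, 0 free pages
print(summarize(half))   -- 64 free slots, 2 free pages
```

Both patterns free the same number of bytes, but only the second one yields whole pages that can be handed out again for larger slabs.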

Now let’s try inserting the big entry which failed to insert in the previous section:

local ok, err = cats:safe_add("Jimmy", string.rep("a", 200))
if not ok then
    ngx.say("failed to add a big entry: ", err)
else
    ngx.say("succeeded in inserting the big entry.")
end

This time it succeeds because we have plenty of contiguous free space to accommodate this key-value pair:

$ curl 'localhost:8080/t'
inserted 255 keys.
succeeded in inserting the big entry.

Now the new slabs’ layout indeed has this new entry:

Slabs in a zone with the big entry inserted

Please note the longest red block in the first half of the graph. That is our “big entry”. The size distribution chart for used slabs makes it even clearer:

Size distribution for slabs with a big slab

We can see that our “big entry” is actually a slab of 512 bytes (including the key size, value size, and memory padding and address alignment overhead).

Mitigating Fragmentation

In the previous sections, we saw that scattered small free slabs can cause fragmentation problems in a shared memory zone, making future requests for larger memory blocks impossible to fulfill even though the total size of all these free slabs is bigger than the requested size. To allow reuse of these free slabs, we recommend the following two approaches:

  1. Always use similarly sized data entries so that the problem of accommodating future larger memory block requests does not arise in the first place.
  2. Make deleted entries adjacent to each other so that they can merge into larger free slabs.

For 1), we can divide a single monolithic zone into several zones, one for each entry size group[2]. For example, we can have one zone dedicated to data entries in the 0 ~ 128 byte size range only, and another for the 128 ~ 256 byte range.
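One possible way to route entries by size is sketched below. The zone names `cats_small` and `cats_medium` are hypothetical (each would be declared with its own `lua_shared_dict` directive), the boundaries are the ones from the example above, and for simplicity the size is taken as key length plus value length for string keys and values, ignoring per-entry overhead.

```lua
-- Hypothetical size groups, ordered by ascending size limit. Each `zone`
-- would be a separate lua_shared_dict.
local SIZE_GROUPS = {
    { max_bytes = 128, zone = "cats_small" },
    { max_bytes = 256, zone = "cats_medium" },
}

-- Return the name of the zone whose size range covers this entry,
-- or nil when the entry is too large for every group.
local function zone_for(key, value)
    local size = #key + #value     -- assumes string keys and string values
    for _, group in ipairs(SIZE_GROUPS) do
        if size <= group.max_bytes then
            return group.zone
        end
    end
    return nil
end

-- Store the entry in the zone of its size group. `zones` is a table of
-- dicts indexed by name, such as ngx.shared inside OpenResty.
local function set_by_size(zones, key, value)
    local name = zone_for(key, value)
    if not name then
        return nil, "entry too large for any size group"
    end
    return zones[name]:safe_set(key, value)
end
```

Lookups have to apply the same routing, so in practice the reader either needs to know the size group of a key in advance or has to probe the zones in order.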

For 2), we can group entries by their expiration time. Short-lived entries can live in a dedicated zone while long-lived entries live in another zone. This helps entries expire at a similar pace, increasing the chance that neighboring entries expire and eventually get deleted at the same time.
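Grouping by lifetime could look like the following sketch. The zone names `cats_short_lived` and `cats_long_lived` and the 60 second cutoff are hypothetical values chosen for illustration; the only ngx_lua behavior relied on is that the third argument of `set()` is the expiration time in seconds, with 0 meaning the entry never expires.

```lua
-- Hypothetical cutoff: entries living up to 60 seconds count as short-lived.
local SHORT_TTL_MAX = 60

-- Pick a zone name from the entry's time to live in seconds.
-- A TTL of 0 means "never expires", so it belongs to the long-lived zone.
local function zone_for_ttl(ttl)
    if ttl > 0 and ttl <= SHORT_TTL_MAX then
        return "cats_short_lived"
    end
    return "cats_long_lived"
end

-- Store the entry in the zone that matches its lifetime. `zones` is a
-- table of dicts indexed by name, such as ngx.shared inside OpenResty.
local function set_by_ttl(zones, key, value, ttl)
    local dict = zones[zone_for_ttl(ttl)]
    return dict:set(key, value, ttl)
end
```

More than two lifetime groups may be appropriate when the expiration times in an application vary widely.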

Conclusion

Memory fragmentation inside OpenResty or Nginx’s shared memory zones can be quite hard to notice or troubleshoot. Fortunately, OpenResty XRay provides powerful observability and visualizations to see and diagnose such problems quickly. We presented several small examples to demonstrate the memory fragmentation issue and ways to work around it, using OpenResty XRay’s graphs and data to explain what is happening under the hood. Finally, we introduced best practices for working with shared memory zones in configurations and programs based on OpenResty or Nginx.

About The Author

Yichun Zhang is the creator of the OpenResty® open source project. He is also the founder and CEO of OpenResty Inc. He has contributed a dozen open source Nginx 3rd-party modules and quite a few Nginx and LuaJIT core patches, and he designed the OpenResty XRay platform.

Translations

We provide the Chinese translation for this article on blog.openresty.com.cn. We also welcome interested readers to contribute translations in other natural languages as long as the full article is translated without any omissions. We thank them in advance.

We are hiring

We always welcome talented and enthusiastic engineers to join our team at OpenResty Inc. to explore various open source software’s internals and build powerful analyzers and visualizers for real world applications built atop the open source software. If you are interested, please send your resume to talents@openresty.com. Thank you!


  1. For OpenResty and Nginx’s shared memory zones, the allocated and accessed memory pages can never be returned to the operating system until all the current nginx processes quit. Nevertheless, released memory pages or memory slabs can still be reused for future memory requests inside the shm zone. In OpenResty or Nginx’s shared memory zones, memory fragmentation may also happen when the requested memory slabs or blocks are of varying sizes. Many standard Nginx directives involve shared memory zones, like ssl_session_cache, proxy_cache_path, limit_req_zone, limit_conn_zone, and upstream’s zone. Nginx’s 3rd-party modules may use shared memory zones as well, like one of OpenResty’s core components, the ngx_lua module. For shm zones with equal-sized data entries, memory fragmentation is not an issue.

  2. Interestingly, the Linux kernel’s buddy allocator and Memcached’s allocator use a similar strategy.