OpenResty and Nginx servers are often configured with shared memory zones, which hold data shared among all their worker processes. For example, Nginx’s standard modules ngx_http_limit_req and ngx_http_limit_conn use shared memory zones to hold the state needed to limit the client request rate and the concurrency level of client requests across all the worker processes. OpenResty’s ngx_lua module offers lua_shared_dict, a shared memory dictionary data store for user Lua code.

In this article, we will explore how these shared memory zones consume physical memory (or RAM) through several minimal and self-contained examples. We will also examine how shared memory utilization affects system-level process memory metrics like VSZ and RSS, as seen in the output of system utilities like ps. Finally, we will discuss the “fake memory leak” issues caused by the on-demand usage nature of the shared memory zones, as well as the effect of Nginx’s HUP reload operation.

As with almost all the technical articles on this blog site, we use our OpenResty XRay dynamic tracing product to analyze and visualize the internals of unmodified OpenResty or Nginx servers and applications. Because OpenResty XRay is a noninvasive analyzing platform, we don’t need to change anything in the target OpenResty or Nginx processes: no code injection is needed, and no special plugins or modules need to be loaded into the target processes. This ensures that what we see inside the target processes through OpenResty XRay analyzers is exactly what happens when there are no observers at all.

We would like to use ngx_lua module’s lua_shared_dict in most of the examples below since it is programmable by custom Lua code. The behaviors and issues we demonstrate in these examples also apply well to any other shared memory zones found in all standard Nginx modules and 3rd-party ones.

Slabs and pages

Nginx and its modules usually use the slab allocator implemented by the Nginx core to manage the memory storage inside a shared memory zone. The slab allocator is designed specifically for allocating and deallocating small memory chunks inside a fixed-size memory region. On top of the slabs, the shared memory zones may introduce higher level data structures like red-black trees and linked lists. A slab can be as small as a few bytes or large enough to span multiple memory pages.

The operating system manages the shared memory (or any other kinds of memory) by pages. On x86_64 Linux, the default page size is usually 4 KB but it can be different depending on the architecture and Linux kernel configurations. For example, some Aarch64 Linux systems have a page size of 64 KB.
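To check the page size on a particular machine, and to see how many pages a zone of a given size spans, a small shell sketch like the following may help. The helper name pages_in_zone is ours and is purely illustrative.

```shell
# Print the page size of the running system, in bytes
# (4096 on most x86_64 Linux systems).
getconf PAGESIZE

# pages_in_zone SIZE_MB: number of memory pages spanned by a zone of
# SIZE_MB megabytes at the current page size (hypothetical helper).
pages_in_zone() {
    echo $(( $1 * 1024 * 1024 / $(getconf PAGESIZE) ))
}

pages_in_zone 100   # prints 25600 when the page size is 4096 bytes
```

The result depends on the machine: with a 64 KB page size the same 100 MB zone spans only 1600 pages.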

We shall see detailed memory page level and slab level statistics for shared memory zones in real OpenResty and Nginx processes.

What is allocated is not what is paid for

Compared with disks, physical memory (or RAM) is always a very precious resource. Most modern operating systems employ demand paging as an optimization trick to reduce the pressure user applications put on the RAM. Basically, when you allocate a large chunk of memory, the operating system kernel defers the actual assignment of RAM resources (or physical memory pages) to the point where these memory pages’ content is actually used. For example, if the user process allocates 10 pages of memory and only ever uses 3 pages, then the operating system may back only these 3 pages with RAM. The same applies to the shared memory zones allocated in an Nginx or OpenResty application. The user may configure huge shared memory zones in the nginx.conf file but may notice that the server takes almost no extra memory immediately after starting up, because very few of the shared memory pages are actually used.

Empty zones

Let’s consider the following sample nginx.conf file, which allocates an empty shared memory zone that is never used:

master_process on;
worker_processes 2;

events {
    worker_connections 1024;
}

http {
    lua_shared_dict dogs 100m;

    server {
        listen 8080;

        location = /t {
            return 200 "hello world\n";
        }
    }
}

Here we configure a 100 MB shared memory zone named dogs via the lua_shared_dict directive, and 2 worker processes for this server. Please note that we never touch this dogs zone in this configuration, so the zone should be empty.

Let’s start this server like below:

mkdir ~/work/
cd ~/work/
mkdir logs/ conf/
vim conf/nginx.conf # paste the nginx.conf sample above here
/usr/local/openresty/nginx/sbin/nginx -p $PWD/

We can check if the nginx processes are already running like this:

$ ps aux|head -n1; ps aux|grep nginx
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
agentzh 9359 0.0 0.0 137508 1576 ? Ss 09:10 0:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx -p /home/agentzh/work/
agentzh 9360 0.0 0.0 137968 1924 ? S 09:10 0:00 nginx: worker process
agentzh 9361 0.0 0.0 137968 1920 ? S 09:10 0:00 nginx: worker process

The worker processes take similar amounts of memory. Let’s focus on the worker process with PID 9360 from now on. In the OpenResty XRay console’s web UI, we can see that this process takes a total of 134.73 MB of virtual memory and 1.88 MB of resident memory (identical to what the ps command reports above):
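The ps utility reports VSZ and RSS in KB, so the MB figures are simply those numbers divided by 1024. A minimal sketch of the conversion, using a helper name (kb_to_mb) that we made up for illustration:

```shell
# kb_to_mb KB: convert a size in KB, as printed by ps, to MB with
# two decimal places (hypothetical helper).
kb_to_mb() {
    awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / 1024 }'
}

kb_to_mb 137968   # VSZ of worker 9360, prints 134.73
kb_to_mb 1924     # RSS of worker 9360, prints 1.88
```

To print only these two columns for a single process, something like `ps -o vsz=,rss= -p 9360` should work on most systems.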

Virtual memory usage breakdown for an empty zone

As we already discussed in the other article, How OpenResty and Nginx Allocate Memory, what really matters is the resident memory usage, which actually maps hardware resources (like RAM[1]) to the corresponding memory pages. Very little memory is actually backed by hardware resources here, just 1.88 MB in total. The 100 MB shared memory zone we configured above accounts for only a very small part of this resident memory (as we will see in detail below). The zone does, however, add its full 100 MB to the virtual memory size of this process. The operating system reserves the virtual memory address space for this shared memory zone, but that is just bookkeeping which does not take up any RAM or other hardware resources at all.

Empty is not empty

To check if this empty shared memory zone takes up any resident (or physical) memory at all, we can refer to the "Application-Level Memory Usage Breakdown" chart for this process below:

Application-Level Memory Usage Breakdown

Interestingly, we see a nonzero Nginx Shm Loaded component in this pie chart. It is a tiny portion, just 612 KB. So an empty shared memory zone is not completely empty. This is because Nginx always stores some metadata for bookkeeping purposes in any newly initialized shared memory zone. Such metadata is used by Nginx’s slab allocator.

Loaded and unloaded pages

We can check out how many memory pages are actually used (or loaded) inside all the shared memory zones by looking at the following chart produced automatically by OpenResty XRay:

Loaded and unloaded memory pages in shared memory zones

We can see that 608 KB of memory is loaded (or actually used) in the dogs zone. There is also a special ngx_accept_mutex_ptr zone, which the Nginx core allocates automatically for the accept_mutex feature. Adding the two sizes together gives 612 KB, which is exactly the Nginx Shm Loaded size shown in the pie chart above. As we mentioned above, the 608 KB of memory used by the dogs zone is actually metadata used by the slab allocator.

The unloaded memory pages are just reserved virtual memory address space that has never been touched (or used).

A word on process page tables

One complication we haven’t mentioned yet is that each nginx worker process has its own page table, used by the CPU hardware or the operating system kernel when looking up a virtual memory page. For this reason, each process may have a different set of loaded pages for exactly the same shared memory zone, because each process may have touched a different set of memory pages in its own execution history. To simplify the analysis here, OpenResty XRay always shows all the memory pages that are loaded by any of the worker processes, even if the current target worker process has not touched some of those pages. For this reason, the total size of loaded pages here may (slightly) exceed the actual portion in the resident memory size of the target process.
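Without OpenResty XRay, a rough per-process view of the same effect is available on Linux through /proc/PID/smaps, which lists the mapped size and the resident size of every mapping as seen by that one process. The sketch below sums both for shared anonymous mappings. It assumes that such mappings are labeled "/dev/zero (deleted)" in smaps, which is how Linux usually names them but is not something the article itself states; the helper name shm_rss is also ours.

```shell
# shm_rss SMAPS_FILE: sum the mapped size and the resident size (in kB)
# of all mappings labeled "/dev/zero (deleted)" in an smaps file.
# Assumption: shared memory zones show up under that label on Linux.
shm_rss() {
    awk '
        # A mapping header line starts with an address range "start-end".
        /^[0-9a-f]+-[0-9a-f]+ / { in_shm = ($0 ~ /\/dev\/zero \(deleted\)/) }
        in_shm && $1 == "Size:" { size += $2 }
        in_shm && $1 == "Rss:"  { rss  += $2 }
        END { printf "mapped=%d kB resident=%d kB\n", size, rss }
    ' "$1"
}

# Usage (Linux only), with the worker PID taken from the ps output:
#   shm_rss /proc/9360/smaps
```

Running it against two worker processes of the same server may print different resident figures for the same zones, which is the per-process page table effect described above.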

Free and used slabs

As we have discussed above, Nginx usually manages the shared memory zone by slabs instead of memory pages. We can directly see the statistics of the used and free (or unused) slabs inside a particular shared memory zone through OpenResty XRay:

Free and used slabs in zone dogs

As expected, most of the slabs are free or unused for our example. Note that the size numbers are actually much smaller than the memory page level statistics shown in the previous section. This is because we are now on a higher abstraction level, the slabs level, excluding most of the slab allocator’s own memory overhead and the memory page padding overhead.

We can further observe the size distribution of all the individual slabs in this dogs zone through OpenResty XRay:

Used slab size distribution for an empty zone

Free slab size distribution

We can see that for this empty zone, there are still 3 used slabs and 157 free slabs, or 3 + 157 = 160 slabs in total. Please keep this number in mind for when we later compare it with the same dogs zone with some user data inserted.

Zones with user data

Now let’s modify our previous example by inserting some data upon Nginx server startup. Basically, we just need to add the following init_by_lua_block directive to the nginx.conf file’s http {} configuration block:

init_by_lua_block {
    for i = 1, 300000 do
        ngx.shared.dogs:set("key" .. i, i)
    end
}

Here we initialize our dogs shared memory zone by inserting 300,000 key-value pairs into it during the server startup.

Then let’s restart the server with the following shell commands:

kill -QUIT `cat logs/nginx.pid`
/usr/local/openresty/nginx/sbin/nginx -p $PWD/

The new Nginx processes now look like this:

$ ps aux|head -n1; ps aux|grep nginx
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
agentzh 29733 0.0 0.0 137508 1420 ? Ss 13:50 0:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx -p /home/agentzh/work/
agentzh 29734 32.0 0.5 138544 41168 ? S 13:50 0:00 nginx: worker process
agentzh 29735 32.0 0.5 138544 41044 ? S 13:50 0:00 nginx: worker process

Virtual memory and resident memory

For the Nginx worker process 29735, OpenResty XRay gives the following pie chart:

Virtual memory usage breakdown for a non-empty zone

The resident memory is now significantly larger than in the previous empty zone case, and it also takes a much larger portion of the total virtual memory size (29.6%). The virtual memory size is only slightly larger than before (135.30 MB vs 134.73 MB). Because the shared memory zones’ sizes stay the same, they contribute nothing to the increased virtual memory size. The increase comes from the Lua code newly added via the init_by_lua_block directive (this small addition also contributes to the resident memory size).
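These figures can be re-derived from the two ps listings. The sketch below also makes a back-of-the-envelope estimate of the resident memory growth per inserted key. That estimate is our own rough calculation, not a number reported by OpenResty XRay: it compares two different server runs and includes the small overhead of the added Lua code, so it should be read as an approximation only. The helper names are hypothetical.

```shell
# percent PART WHOLE: PART as a percentage of WHOLE, one decimal place.
percent() {
    awk -v p="$1" -v w="$2" 'BEGIN { printf "%.1f\n", 100 * p / w }'
}

# per_key_bytes RSS_BEFORE_KB RSS_AFTER_KB KEYS: resident growth per key,
# in bytes. A rough estimate only; see the caveats in the text above.
per_key_bytes() {
    awk -v a="$1" -v b="$2" -v n="$3" \
        'BEGIN { printf "%.1f\n", (b - a) * 1024 / n }'
}

# RSS and VSZ (in KB) of worker 29735 from the second ps listing.
percent 41044 138544              # prints 29.6

# RSS of worker 9360 (empty zone) vs worker 29735 (300,000 keys).
per_key_bytes 1924 41044 300000   # prints 133.5
```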

The application-level memory usage breakdown shows that the Nginx shared memory zone’s loaded memory takes most of the resident memory:

Loaded and unloaded pages in zone dogs

Loaded and unloaded pages

Now we have many more loaded memory pages and far fewer unloaded ones inside this dogs shared memory zone:

Loaded and unloaded pages for zone dogs

Free and used slabs

This time we have 300,000 more used slabs (in addition to the 3 pre-allocated slabs in an empty zone):

Used slabs for non-empty zone dogs

Apparently each key-value pair in the lua_shared_dict zone corresponds to a single slab.

The number of free slabs is exactly the same as in the empty zone case, i.e., 157 slabs:

Free slabs for a non-empty zone dogs

Fake Memory Leaks

As we demonstrated above, shared memory zones consume RAM resources only as more and more of their memory pages get accessed by the applications. For this reason, it may seem to the user that the resident memory usage of nginx worker processes keeps growing indefinitely, especially right after the processes are started. This may raise a false alarm about memory leaks. The following chart shows such an example:

process memory growing

By looking at the application-level memory breakdown chart produced by OpenResty XRay, we can clearly see that the Nginx shared memory zones are taking most of the resident memory here:

Memory usage breakdown for huge shm zones

Such memory growth is temporary and will stop once the shared memory zones are all filled up. But it also poses a potential risk when the shared memory zones are configured too large, larger than the current operating system can ever accommodate. For this reason, it is always a good idea to keep an eye on page-level memory consumption graphs like the one below:
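One simple sanity check is to add up the configured zone sizes and compare the total with the RAM available on the machine. The following sketch does this for lua_shared_dict directives only. It assumes one directive per line, handles sizes given in bytes or with the k and m suffixes, and does not follow include files; the helper name total_shdict_mb is ours.

```shell
# total_shdict_mb CONF: sum of all lua_shared_dict zone sizes found in
# the nginx configuration file CONF, in MB (hypothetical helper).
total_shdict_mb() {
    awk '
        $1 == "lua_shared_dict" {
            size = tolower($3)
            sub(/;$/, "", size)            # drop the trailing semicolon
            n = size + 0                   # numeric prefix: "100m" -> 100
            if (size ~ /k$/) n = n / 1024
            else if (size ~ /^[0-9]+$/) n = n / 1048576
            total += n
        }
        END { printf "%.0f\n", total }
    ' "$1"
}

# Usage, from the ~/work/ directory used in this article:
#   total_shdict_mb conf/nginx.conf      # 100 for the sample config
# Total RAM in MB on Linux, for comparison:
#   awk '/^MemTotal:/ { printf "%.0f\n", $2 / 1024 }' /proc/meminfo
```

If the configured total approaches or exceeds the physical memory, the zones cannot all be filled without the system running short of RAM.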

Loaded and unloaded memory pages in shared memory zones

The blue portions may eventually be used up by the process (i.e., turn red) and have a real impact on the current system.

HUP reload

Nginx supports the HUP signal for reloading the server configuration without quitting the master process (the worker processes are still exited gracefully and relaunched, however). Usually the Nginx shared memory zones automatically inherit the existing data after the HUP reload operation, so any physical memory pages previously assigned for the accessed shared memory data will stay. Thus any attempt to use HUP reload to release the shared memory zones’ existing resident memory pages will fail. The user should use a full restart or Nginx’s binary upgrade operation instead.

Nevertheless, it is up to the Nginx modules implementing the shared memory zones to decide whether to keep the data during a HUP reload. So there might be exceptions.

Conclusion

We have explained that Nginx’s shared memory zones may take much less physical memory than the size configured in the nginx.conf file, thanks to the demand-paging feature of modern operating systems. We demonstrated that empty shared memory zones may still utilize some memory pages and slabs to store the slab allocator’s metadata. By means of OpenResty XRay analyzers, we can easily examine exactly how much memory is actually used or loaded by the shared memory zones inside any running nginx worker processes in real time, both on the memory page level and the slab level.

On the other hand, the demand-paging optimization may also produce steady memory usage growth for a period of time, which is not really a memory leak but may still pose risks. We also covered that Nginx’s HUP reload operation usually does not clear existing data in shared memory zones.

In future articles on this blog site, we will continue looking at high level data structures used in shared memory zones like red-black trees and queues, and will also analyze and mitigate memory fragmentation issues inside shared memory zones.

About The Author

Yichun Zhang is the creator of the OpenResty® open source project. He is also the founder and CEO of OpenResty Inc. He has contributed a dozen open source Nginx 3rd-party modules and quite a few Nginx and LuaJIT core patches, and he designed the OpenResty XRay platform.

Translations

We provide the Chinese translation for this article on blog.openresty.com.cn. We also welcome interested readers to contribute translations in other natural languages as long as the full article is translated without any omissions. We thank them in advance.

We are hiring

We always welcome talented and enthusiastic engineers to join our team at OpenResty Inc. to explore various open source software’s internals and build powerful analyzers and visualizers for real world applications built atop the open source software. If you are interested, please send your resume to talents@openresty.com . Thank you!


  1. When swapping happens, some resident memory pages would be saved and mapped to disk devices.