This tutorial demonstrates both the right and the wrong ways to benchmark user Lua code in OpenResty.
First of all, make sure our CPU is always running at full speed.
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
The scaling governor usually takes the value powersave by default, and we need to set it to performance.
The simplest way to time some Lua code is to use the time command with the resty command.
time resty -e 'ngx.re.find("hello, world.", [[\w+\.]], "jo")'
But there is a catch. The resty command itself has startup and exit overhead.
time resty -e ''
We can see there is an overhead of about 11 milliseconds on this machine.
Instead, we should use the ngx.now Lua API function provided by OpenResty.
restydoc -s ngx.now
Let's put our Lua code into a file named ./bench.lua for better readability.
We make the following edits:
- First of all, we make sure the cached time inside nginx is up to date.
- And then we record the begin time which has millisecond precision.
- And then we add our aforementioned regex matching call.
- And then we update our cached time again.
- Finally, we output the elapsed time by subtracting the begin time from the current time.
- Let’s save the file.
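The edits listed above can be sketched as the following bench.lua. This is a minimal reconstruction rather than the exact script: we assume ngx.update_time() is the call used to refresh the cached time, and the elapsed variable and the output wording are our own illustrative choices.

```lua
-- bench.lua: a naive single-shot timing of one regex match

-- refresh nginx's cached time so that ngx.now() is up to date
ngx.update_time()

-- ngx.now() returns the cached time in seconds, as a floating-point
-- number with millisecond precision
local begin = ngx.now()

-- the code being timed: a single regex matching call
ngx.re.find("hello, world.", [[\w+\.]], "jo")

-- refresh the cached time again before reading it
ngx.update_time()

-- elapsed time in seconds
local elapsed = ngx.now() - begin
print("elapsed: " .. elapsed .. " sec")
```

Since the clock only has millisecond precision, a single call that finishes much faster than that cannot be measured reliably this way.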
Then run the script with the resty shell command.
It records a latency of about 1 millisecond. But we will soon see that this is very inaccurate.
The correct way is to make the following edits in the bench.lua file:
- Put the regex matching call into a Lua function named target.
- And then call this function 100 times first as a warmup. The target function should be JIT compiled by the time this loop finishes.
- And then inside the timed code region, we call it repeatedly, 10 million times.
- Finally, we compute the average time per call.
local function target() ngx.re.find("hello, world.", [[\w+\.]], "jo") end
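Putting these edits together, the revised bench.lua might look like the sketch below. The names N and avg and the output wording are illustrative assumptions, and we again assume ngx.update_time() is used to refresh the cached time.

```lua
-- bench.lua: warm up first, then time many calls and take the average

local function target()
    ngx.re.find("hello, world.", [[\w+\.]], "jo")
end

-- warmup: run the function enough times that LuaJIT gets a chance
-- to JIT compile it before the timed region starts
for _ = 1, 100 do
    target()
end

ngx.update_time()
local begin = ngx.now()

local N = 1e7  -- 10 million timed calls
for _ = 1, N do
    target()
end

ngx.update_time()

-- average time per call, in seconds
local avg = (ngx.now() - begin) / N
print("elapsed: " .. avg .. " sec per call")
```

Repeating the call many times spreads the millisecond-level clock error over all the iterations, which is what makes a per-call figure in the nanosecond range meaningful.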
We now run this script again.
We can see that it takes merely about 30 nanoseconds per call. That is tens of thousands of times faster than the previous result!
Actually, we can further make sure no dead GC objects are hanging around before we time the code. Just insert a collectgarbage() call before the first ngx.update_time() call.
Here we force a full GC cycle before recording the begin time.
It does not help much with our example here, however. This is because our timed code does not create many GC objects anyway.
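As a rough sketch, the start of the timed region could then look like this. The exact placement is our assumption: we put the call right before the time is refreshed and the begin time is recorded.

```lua
-- force a full GC cycle so that dead objects left over from earlier
-- code are not collected inside the timed region
collectgarbage()

-- then refresh the cached time and record the begin time as before
ngx.update_time()
local begin = ngx.now()
```

Calling collectgarbage() with no arguments performs a full collection cycle, which is the standard Lua behavior for the default "collect" option.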
We can make the target function faster by avoiding unnecessary Lua table lookup operations.
local re_find = ngx.re.find
But the difference may not be measurable here.
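For illustration, the target function could use the cached local like this. It is a small sketch showing only the relevant part of the script, with the rest of bench.lua assumed to be unchanged.

```lua
-- look up ngx.re.find once and keep it in a local variable, instead of
-- indexing the ngx and ngx.re tables on every call
local re_find = ngx.re.find

local function target()
    re_find("hello, world.", [[\w+\.]], "jo")
end
```

The local is captured by target as an upvalue, so each call skips the two table lookups that ngx.re.find would otherwise require.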
That is all I'd like to cover today. I hope you found it interesting.
If you like this tutorial, please subscribe to this blog site and our YouTube channel. Thank you!
This article and its associated video are both generated automatically from a simple screenplay file.
Yichun Zhang is the creator of the OpenResty® open source project. He is also the founder and CEO of OpenResty Inc. He has contributed a dozen open source Nginx third-party modules and a good number of Nginx and LuaJIT core patches, and he designed the OpenResty XRay platform.
We provide the Chinese translation for this article on blog.openresty.com.cn. We also welcome interested readers to contribute translations in other natural languages as long as the full article is translated without any omissions. We thank them in advance.