How CPU time is spent inside Rust’s Sled library (using OpenResty XRay)
In this tutorial, we will walk through how to use OpenResty XRay to quantitatively analyze the CPU time consumption in the Rust Sled library. We will show the Rust code paths that consume the most CPU time. These hot code paths are obtained by OpenResty XRay automatically analyzing and interpreting the Rust language-level CPU flame graphs.
Problem: high CPU usage
Sled is an embedded KV database written in Rust.
We have an in-house cache service based on Sled.
The CPU usage is very high, more than 100%.
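A figure above 100% is possible because tools such as top, by default, report a process's CPU usage relative to a single core, so a multi-threaded process can exceed 100%. Below is a minimal sketch of that arithmetic. The helper name cpu_percent and the numbers are illustrative assumptions, not measurements from this service.

```rust
/// Hypothetical helper: CPU usage as a percentage of one core,
/// computed from the CPU time consumed during a wall-clock interval.
fn cpu_percent(cpu_secs: f64, wall_secs: f64) -> f64 {
    100.0 * cpu_secs / wall_secs
}

fn main() {
    // Illustrative numbers only: two threads, each busy for 0.75 s
    // of a 1 s sampling interval.
    let pct = cpu_percent(2.0 * 0.75, 1.0);
    println!("{:.0}%", pct);
}
```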
Use the guided analysis feature of OpenResty XRay to spot the most CPU-intensive Rust code paths
Let’s use OpenResty XRay to check this unmodified process. You can analyze it in real-time and figure out what is happening. Let’s open the OpenResty XRay web console in the web browser.
Make sure it is the right machine you are watching.
You can choose the right machine from the list below if the current one is not correct.
Go to the “Guided Analysis” page.
Here you can see different types of problems that you can diagnose.
Let’s select “High CPU Usage”.
Click on “Next”.
Select the Sled application we saw earlier. The application type is Rust.
Select the process that consumes more than 100% of the CPU resources. This is what we saw previously in top.
Make sure that the application type is right. Usually the default should be correct.
The language level here is just “Rust”.
We can also set the maximum analyzing time. We’ll leave it as 300 seconds, which is the default value.
Let’s start analyzing.
The system will keep performing different rounds of analysis. Now it’s executing the first round.
The first round is done and it’s already on the second one. That’s enough for this case.
Let’s stop analyzing.
It automatically generated an analysis report.
This is the type of problem we are going to diagnose. It’s CPU.
This is the #1 hottest Rust-land code path for the CPU time.
The first function, sled::tree::Tree::insert, is used in Sled for data insertion.
Click “More” to see details.
The hot code path was automatically inferred from this Rust-land CPU flame graph.
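To make the idea of reading a hot path off a flame graph concrete, here is a minimal sketch. It is not OpenResty XRay's inference algorithm, which the tutorial does not describe. It assumes the samples are available as folded stacks ("root;child;leaf" plus a sample count) and simply walks down from the root, always following the child frame with the most samples. The function name hottest_path, the frame names, and the counts are all made up for illustration.

```rust
use std::collections::HashMap;

/// Given folded stack samples ("root;child;leaf" plus a sample count),
/// walk down from the root, always following the child frame with the
/// most samples, and return the resulting path.
fn hottest_path(samples: &[(&str, u64)]) -> Vec<String> {
    let mut path: Vec<String> = Vec::new();
    loop {
        // Sum samples per next frame among stacks that extend the current path.
        let mut counts: HashMap<&str, u64> = HashMap::new();
        for (stack, n) in samples {
            let frames: Vec<&str> = stack.split(';').collect();
            let matches_prefix = frames.len() > path.len()
                && frames.iter().zip(path.iter()).all(|(f, p)| *f == p.as_str());
            if matches_prefix {
                *counts.entry(frames[path.len()]).or_insert(0) += *n;
            }
        }
        // Pick the hottest child; ties go to the alphabetically first name
        // so the result is deterministic.
        match counts
            .into_iter()
            .max_by(|a, b| a.1.cmp(&b.1).then(b.0.cmp(a.0)))
        {
            Some((frame, _)) => path.push(frame.to_string()),
            None => return path,
        }
    }
}

fn main() {
    // Made-up sample counts, for illustration only.
    let samples = [("main;a;x", 5), ("main;a;y", 2), ("main;b", 3)];
    println!("{}", hottest_path(&samples).join(" -> "));
}
```

A greedy descent like this can miss a path whose individual frames are each slightly cooler than a sibling, so treat it only as a way to build intuition for what a "hot code path" means.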
Below are more detailed explanations and suggestions regarding the current issue. It mentions the insert function we saw earlier.
Click the icon to enlarge the flame graph.
Click the insert function to see more details.
On the left side, we can see that the view_for_key function takes up a large portion. This is the function in the Sled library that gets a snapshot view for a given key.
On the right, pagecache is a component of Sled used to manage data on a page-by-page basis. Writes are first stored in the pagecache’s in-memory pages and are then persisted by flushing to disk when the batch is full.
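The buffer-then-flush pattern described above can be sketched as follows. This is a simplified illustration under stated assumptions, not Sled's pagecache implementation. The BatchedWriter type, its fields, and the fixed entry-count threshold are hypothetical, and the flush step only clears the buffer instead of writing to disk.

```rust
/// Hypothetical write buffer: collects writes in memory and hands them
/// to a flush step once `capacity` entries have accumulated.
struct BatchedWriter {
    capacity: usize,
    pending: Vec<(Vec<u8>, Vec<u8>)>,
    flushed_batches: usize,
}

impl BatchedWriter {
    fn new(capacity: usize) -> Self {
        BatchedWriter {
            capacity,
            pending: Vec::with_capacity(capacity),
            flushed_batches: 0,
        }
    }

    /// Buffer one key/value pair; returns true if this write triggered a flush.
    fn write(&mut self, key: &[u8], value: &[u8]) -> bool {
        self.pending.push((key.to_vec(), value.to_vec()));
        if self.pending.len() >= self.capacity {
            self.flush();
            true
        } else {
            false
        }
    }

    /// Stand-in for persisting the batch; a real store would write to disk here.
    fn flush(&mut self) {
        if !self.pending.is_empty() {
            self.pending.clear();
            self.flushed_batches += 1;
        }
    }
}

fn main() {
    let mut writer = BatchedWriter::new(3);
    for i in 0..7u8 {
        let flushed = writer.write(&[i], b"value");
        println!("write {} flushed={}", i, flushed);
    }
    println!(
        "batches flushed: {}, still pending: {}",
        writer.flushed_batches,
        writer.pending.len()
    );
}
```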
Click to zoom in.
This is the realloc function in glibc, which resizes a previously allocated memory block. We can see that libc’s memory allocation functions are hot here.
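One common way for Rust code to end up in the allocator's resize path is growing a heap buffer such as a Vec past its capacity. The sketch below is a general illustration of that effect, not a statement about what Sled does on this code path. The helper count_reallocations is hypothetical and uses capacity changes as a proxy for how often the backing allocation was resized.

```rust
/// Push `n` bytes into `buf` and count how many times its capacity changed,
/// which is a proxy for how often the backing allocation was resized.
fn count_reallocations(mut buf: Vec<u8>, n: usize) -> usize {
    let mut reallocations = 0;
    let mut last_capacity = buf.capacity();
    for i in 0..n {
        buf.push(i as u8);
        if buf.capacity() != last_capacity {
            reallocations += 1;
            last_capacity = buf.capacity();
        }
    }
    reallocations
}

fn main() {
    let n = 100_000;
    // Starts empty, so the buffer has to grow repeatedly.
    let grown = count_reallocations(Vec::new(), n);
    // Reserves room for all `n` bytes up front.
    let preallocated = count_reallocations(Vec::with_capacity(n), n);
    println!("grown on demand: {} capacity changes", grown);
    println!("preallocated:    {} capacity changes", preallocated);
}
```

Reserving capacity up front is one general way to reduce allocator traffic. Whether it applies to a particular hot path depends on the code that the flame graph points to.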
On the terminal, use the find command to locate the Sled library source code directory in the cargo cache.
Copy the directory path and use it to change into the Sled source code directory.
Let’s go back to the original hot code path. Hover the mouse over the green box for the insert function. You can see the source file of this function in the tooltip.
The source line number is 164.
Click the icon to copy the source file path.
Use the vim editor to open the source file. Paste the file path we just copied. You can use any editor you like.
Go to line 164, as OpenResty XRay suggested.
This line is inside the insert function.
Next, check the second code path. The #2 hottest code path consumes almost 40% of the CPU time.
The top function call, get_inner, is the function in Sled that looks up the data.
The get function is the interface that the library exposes to the user to get data, and it calls get_inner internally.
Click “More” to see details.
Enlarge the flame graph to see details of the get_inner function calls.
Zoom in on get_inner.
As shown, most of the time in the get_inner function is taken up by the view_for_key function mentioned earlier.
The sled::lru::Lru::accessed function is used in Rust’s Sled library to update the accessed state of an item in the LRU cache and return a list of page IDs that need to be evicted.
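To illustrate the behavior just described (record an access, then report which pages should be evicted), here is a minimal sketch of an LRU tracker. It is not Sled's implementation, and the real signature of sled::lru::Lru::accessed is not shown in this tutorial. The Lru type below, its count-based capacity, and the u64 page IDs are assumptions made for the example.

```rust
use std::collections::VecDeque;

/// Hypothetical, simplified LRU: tracks page IDs in recency order and
/// evicts the least recently used pages once `capacity` is exceeded.
struct Lru {
    capacity: usize,
    // Front = least recently used, back = most recently used.
    order: VecDeque<u64>,
}

impl Lru {
    fn new(capacity: usize) -> Self {
        Lru { capacity, order: VecDeque::new() }
    }

    /// Mark `page_id` as just accessed and return the page IDs to evict.
    fn accessed(&mut self, page_id: u64) -> Vec<u64> {
        // Move the page to the most-recently-used position.
        if let Some(pos) = self.order.iter().position(|&id| id == page_id) {
            self.order.remove(pos);
        }
        self.order.push_back(page_id);

        // Evict from the least-recently-used end until we fit again.
        let mut evicted = Vec::new();
        while self.order.len() > self.capacity {
            if let Some(id) = self.order.pop_front() {
                evicted.push(id);
            }
        }
        evicted
    }
}

fn main() {
    let mut lru = Lru::new(2);
    for page_id in [1u64, 2, 1, 3] {
        let evicted = lru.accessed(page_id);
        println!("accessed page {} -> evict {:?}", page_id, evicted);
    }
}
```

The linear scan in position keeps the sketch short. It also shows why bookkeeping of this kind can appear in a CPU flame graph when it runs on every read and write.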
Automatic analysis and reports
OpenResty XRay can also monitor online processes automatically and show analysis reports.
Go to the “Insights” page.
You can find the reports on the “Insights” page for daily and weekly periods, so you don’t have to use the “Guided Analysis” feature.
“Guided Analysis” is still useful, though, for application development and demonstration purposes.
What is OpenResty XRay
OpenResty XRay is a dynamic-tracing product that automatically analyzes your running applications to troubleshoot performance problems, behavioral issues, and security vulnerabilities with actionable suggestions. Under the hood, OpenResty XRay is powered by our Y language targeting various runtimes like Stap+, eBPF+, GDB, and ODB, depending on the contexts.
If you like this tutorial, please subscribe to this blog site and/or our YouTube channel. Thank you!
About The Author
Yichun Zhang (GitHub handle: agentzh) is the original creator of the OpenResty® open-source project and the CEO of OpenResty Inc.
Yichun is one of the earliest advocates and leaders of “open-source technology”. He worked at many internationally renowned tech companies, such as Cloudflare and Yahoo!. He is a pioneer of “edge computing”, “dynamic tracing” and “machine coding”, with over 22 years of programming and 16 years of open source experience. Yichun is well-known in the open-source space as the project leader of OpenResty®, adopted by more than 40 million global website domains.
OpenResty Inc., the enterprise software start-up founded by Yichun in 2017, has customers from some of the biggest companies in the world. Its flagship product, OpenResty XRay, is a non-invasive profiling and troubleshooting tool that significantly enhances and utilizes dynamic tracing technology. Its OpenResty Edge product is a powerful distributed traffic management and private CDN software product.
As an avid open-source contributor, Yichun has contributed more than a million lines of code to numerous open-source projects, including Linux kernel, Nginx, LuaJIT, GDB, SystemTap, LLVM, Perl, etc. He has also authored more than 60 open-source software libraries.