In this tutorial, you will get a step-by-step tour of how to use OpenResty XRay to identify the most CPU-intensive Go (golang) code paths of an online Prometheus server. These code paths are the ones that consume the most CPU time and severely affect Prometheus' performance.

Problem: high CPU usage

Let’s start by running the top command to check the CPU usage.

Screenshot

As shown, the Prometheus process consumes about 160% of a CPU core, that is, the equivalent of roughly 1.6 fully busy cores.

Screenshot

Run the ps command to see the full command line for this process.

Screenshot

We can see it is the standard Prometheus binary executable shipped with the Linux distribution.

Screenshot

Use the guided analysis feature of OpenResty XRay to spot the CPU-hottest Go code paths

Let’s use OpenResty XRay to check out this unmodified process. We can analyze it in real time and figure out what’s going on.

Open the OpenResty XRay Web console in the web browser.

Screenshot

Make sure it is the right machine you are watching.

Screenshot

You can choose the right machine from the list below if the current one is not correct.

Screenshot

Go to the “Guided Analysis” page.

Screenshot

Here you can see different types of problems that you can diagnose.

Screenshot

Let’s select “High CPU usage”.

Screenshot

Click on “Next”.

Screenshot

Select the Go application.

Screenshot

Select the process consuming 120% of the CPU resources. This is the same Prometheus process we saw previously in top, though the reading here differs from the 160% sampled there.

Screenshot

Make sure that the application type is right.

Screenshot

Usually the default should be correct.

The language level is just “Go”.

Screenshot

We can also set the maximum analyzing time. We’ll leave it as 300 seconds, which is the default value.

Screenshot

Let’s start analyzing.

Screenshot

The system will keep performing different rounds of analysis. Now it’s executing the first round.

Screenshot

The first 2 rounds are done and it’s on to the third one already. That’s enough for this case.

Screenshot

Let’s stop analyzing.

Screenshot

It shows that the system is generating a report.

Screenshot

We can see it automatically created an analysis report.

Screenshot

This is the type of problem we diagnosed. It’s “CPU”.

Screenshot

As you can see, the Go garbage collection takes up more than 99% of the CPU time.

Screenshot

For example, this Go code path where garbage collection is performed takes up to 21% of the CPU time.

Screenshot

The Go runtime function scanobject belongs to the garbage collector. It scans GC objects in the heap and marks reachable ones.

Screenshot

The function gcDrain is used to drain the work queue of GC objects that need to be marked.

Screenshot
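To make the roles of these two functions more concrete, here is a toy model of a mark phase in Go. It is only a sketch under simplifying assumptions: the object type, the drainMarkQueue function, and the index-based heap are hypothetical, and the real Go runtime is concurrent and far more elaborate.

```go
package main

import "fmt"

// object is a toy heap object that refers to other objects by index.
type object struct {
	refs   []int
	marked bool
}

// drainMarkQueue is a toy stand-in for the mark phase. It pops objects off
// a work queue, scans the references each one holds, and queues every
// object that has not been marked yet. It returns the number of objects
// marked as reachable.
func drainMarkQueue(heap []object, roots []int) int {
	queue := make([]int, 0, len(roots))
	marked := 0
	for _, r := range roots {
		if !heap[r].marked {
			heap[r].marked = true
			marked++
			queue = append(queue, r)
		}
	}
	for len(queue) > 0 {
		id := queue[len(queue)-1]
		queue = queue[:len(queue)-1]
		// "Scan" the object: follow each reference it holds.
		for _, ref := range heap[id].refs {
			if !heap[ref].marked {
				heap[ref].marked = true
				marked++
				queue = append(queue, ref)
			}
		}
	}
	return marked
}

func main() {
	heap := []object{
		{refs: []int{1, 2}}, // 0: the only root
		{refs: []int{2}},    // 1
		{},                  // 2
		{refs: []int{0}},    // 3: unreachable, even though it points at 0
	}
	fmt.Println("reachable objects:", drainMarkQueue(heap, []int{0}))
}
```

The work done by the loop grows with the number of live objects and references, which is why a heap holding many small objects makes marking expensive.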

Allocating many GC objects at a high rate incurs high GC overhead. So the report also shows the hottest Go code paths, those allocating the most objects at the highest rate.

Screenshot
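The link between allocation rate and GC activity can be observed in a small standalone Go program. This is a rough sketch that uses the standard runtime.ReadMemStats API. The function allocateMany, the 64-byte object size, and the package-level sink variable are assumptions made for the illustration and are unrelated to Prometheus itself.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink is a package-level variable, so every slice stored in it must be
// allocated on the heap instead of the stack.
var sink []byte

// allocateMany performs n small heap allocations and reports how many heap
// objects were allocated and how many GC cycles completed in the meantime.
func allocateMany(n int) (mallocs uint64, gcCycles uint32) {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		sink = make([]byte, 64) // the previous slice becomes garbage
	}
	runtime.ReadMemStats(&after)
	return after.Mallocs - before.Mallocs, after.NumGC - before.NumGC
}

func main() {
	mallocs, gcCycles := allocateMany(1000000)
	fmt.Printf("heap objects allocated: %d, GC cycles completed: %d\n",
		mallocs, gcCycles)
}
```

The exact GC cycle count depends on the Go version and the GOGC setting, so the program prints whatever it measures instead of assuming a fixed number.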

Now let’s check the #1 Go code path that allocates the most GC objects.

Screenshot

The function loadWAL loads data from the Write-Ahead Log in Prometheus.

Screenshot

The function Series decodes time series data from the buffer and appends it to the given slice.

Screenshot
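The "decode and append to the given slice" pattern can be sketched as follows. This is not Prometheus' actual record format or API: the decodeNames function and the length-prefixed layout are hypothetical and only illustrate the calling convention.

```go
package main

import (
	"errors"
	"fmt"
)

// decodeNames parses a hypothetical record made of length-prefixed names
// (one length byte followed by that many bytes) and appends each decoded
// name to dst. The caller owns dst and may pass it in again for reuse.
func decodeNames(rec []byte, dst []string) ([]string, error) {
	for len(rec) > 0 {
		n := int(rec[0])
		rec = rec[1:]
		if n > len(rec) {
			return dst, errors.New("truncated record")
		}
		// string(...) copies the bytes into a newly created string.
		dst = append(dst, string(rec[:n]))
		rec = rec[n:]
	}
	return dst, nil
}

func main() {
	records := [][]byte{
		{2, 'u', 'p', 3, 'c', 'p', 'u'},
		{3, 'm', 'e', 'm'},
	}
	var names []string
	for _, rec := range records {
		var err error
		// Reset the length to zero but keep the capacity for reuse.
		names, err = decodeNames(rec, names[:0])
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Println(names)
	}
}
```

Passing the slice in lets the caller reuse its backing array across records. Each decoded string is still a fresh copy, however, so a large log can produce a very large number of GC objects.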

The function slicebytetostring converts a slice of bytes to a string.

Screenshot
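In Go source code, slicebytetostring is not called directly. It is the runtime routine behind a string(b) conversion of a byte slice. The sketch below shows that such a conversion copies the bytes and, when the result escapes to the heap, costs an allocation. The helper allocsPerConversion, the sink variable, and the sample metric name are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable, so the converted string escapes and
// has to live on the heap.
var sink string

// allocsPerConversion reports the average number of heap allocations made
// by one string(b) conversion whose result escapes.
func allocsPerConversion(b []byte) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = string(b)
	})
}

func main() {
	b := []byte(`prometheus_http_requests_total{handler="/metrics"}`)
	s := string(b)
	b[0] = 'P' // the string is a copy, so this write does not change s
	fmt.Println(s[:10], string(b[:10]))
	fmt.Println("allocations per conversion:", allocsPerConversion(b))
}
```

The Go compiler can avoid the copy in some special cases, such as very short strings that do not escape, so the measured count depends on how the result is used.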

Click “More” to see more details.

Screenshot

This code path was automatically inferred from this Go GC object allocation flame graph.

Screenshot

Below are more explanations and suggestions about the issue.

Screenshot

It mentions the function loadWAL.

Screenshot

This function loads data from the Write-Ahead Log.

Screenshot

It also mentions the function Series,

Screenshot

and the function slicebytetostring.

Screenshot

Let’s go back to the code path. Hover the mouse over the green box for the function loadWAL.

Screenshot

We can see its source file and the full path for this file in the tooltip.

Screenshot

The source line number is 141.

Screenshot

Click the icon to copy the full Go source file path for this function.

Screenshot

Use the vim editor to open the source file and look at the Go code in it. You can use any editor you like.

Screenshot

Go to line 141, as OpenResty XRay suggested.

Screenshot

The function dec.Series decodes a set of time series from a record.

Screenshot

On the status bar, you can see that this source line is inside the loadWAL function as shown in the report.

Screenshot

The TSDB of Prometheus creates a memory series to manage the most recent data. The number of GC objects newly allocated by this code path exceeds 19% of the total new allocations.

Screenshot

And we can see here that dynamically allocating new GC objects takes up almost 11% of the CPU time. So such allocations not only incur high garbage collection overhead but are also expensive themselves.

Screenshot

Automatic analysis and reports

OpenResty XRay can also monitor online processes automatically and generate analysis reports.

Go to the “Insights” page.

Screenshot

You can find the automatic reports on the “Insights” page for daily and weekly periods.

Screenshot

For this reason, you don’t have to use the Guided Analysis feature. Guided analysis is useful for application development and demonstration purposes.

Screenshot

What is OpenResty XRay

OpenResty XRay is a dynamic-tracing product that automatically analyzes your running applications to troubleshoot performance problems, behavioral issues, and security vulnerabilities with actionable suggestions. Under the hood, OpenResty XRay is powered by our Y language targeting various runtimes like Stap+, eBPF+, GDB, and ODB, depending on the context.

If you like this tutorial, please subscribe to this blog site and/or our YouTube channel. Thank you!

About The Author

Yichun Zhang (GitHub handle: agentzh) is the original creator of the OpenResty® open-source project and the CEO of OpenResty Inc.

Yichun is one of the earliest advocates and leaders of “open-source technology”. He has worked at many internationally renowned tech companies, such as Cloudflare and Yahoo!. He is a pioneer of “edge computing”, “dynamic tracing” and “machine coding”, with over 22 years of programming and 16 years of open source experience. Yichun is well-known in the open-source space as the project leader of OpenResty®, adopted by more than 40 million global website domains.

OpenResty Inc., the enterprise software start-up founded by Yichun in 2017, has customers from some of the biggest companies in the world. Its flagship product, OpenResty XRay, is a non-invasive profiling and troubleshooting tool that significantly enhances and utilizes dynamic tracing technology. And its OpenResty Edge product is a powerful distributed traffic management and private CDN software product.

As an avid open-source contributor, Yichun has contributed more than a million lines of code to numerous open-source projects, including the Linux kernel, Nginx, LuaJIT, GDB, SystemTap, LLVM, and Perl. He has also authored more than 60 open-source software libraries.