In this tutorial, I will demonstrate how the CPU time is spent qualitatively inside Go’s etcd server with OpenResty XRay. I will show the most CPU-intensive Go code paths in it. OpenResty XRay will automatically analyze the Go (golang) language level CPU flame graph.

Problem: high CPU usage

Let’s start by running the top command to check the CPU usage.

Screenshot

As shown, the etcd process consumes more than 70% of a CPU core.

Screenshot

Run the ps command to see the full command line for this process. We can see it is a standard etcd binary executable without any changes.

Screenshot

Use the guidede analysis feature of OpenResty XRay to spot the hottest Go code paths

Let’s use OpenResty XRay to check out this unmodified process. We can analyze it in real time and figure out what’s going on.

Open the OpenResty XRay web console in the web browser.

Screenshot

Make sure it is the right machine you are watching.

Screenshot

You can choose the right machine from the list below if the current one is not correct.

Screenshot

Go to the “Guided Analysis” page.

Screenshot

Here you can see different types of problems that you can diagnose.

Screenshot

Let’s select “High CPU usage”.

Screenshot

Click on “Next”.

Screenshot

Select the Go application for etcd.

Screenshot

Select the process that consumes more than 70% of the CPU resources. This is what we saw previously in top.

Screenshot

Make sure that the application type is right. Usually the default should be correct.

Screenshot

The language level here is just “Go”.

Screenshot

We can set the maximum analyzing time. We’ll leave it as 300 seconds, which is the default value.

Screenshot

Let’s start analyzing.

Screenshot

The system will keep performing different rounds of analysis. Now it’s executing the first round.

Screenshot

The first round is done and it’s on to the second one already. That’s enough for this case.

Screenshot

Let’s stop analyzing now.

Screenshot

It shows that the system is generating a report for the current analysis.

Screenshot

We can see it automatically created a report.

Screenshot

This is the type of problem we are going to diagnose. It’s CPU.

Screenshot

This is the #1 hottest Go-land code path for the CPU time.

Screenshot

The processUnaryRPC function is in the Go gRPC library. It’s responsible for handling the simplest type of gRPC messages.

Screenshot

This function is called by the handleStream function.

Screenshot

Click “More” to see details about this code path.

Screenshot

The code path was automatically derived from this Go-land CPU flame graph.

Screenshot

Click the icon to enlarge the flame graph.

Screenshot

Continue to zoom in.

Screenshot

The function _KV_Range_Handler gets the keys in the range from the key-value store.

Screenshot

The function _KV_Put_Handler puts the given key into the key-value store.

Screenshot

The Range function is used to query the key-value data stored in etcd by range.

Screenshot

It calls runtime.newobject to create a large number of golang GC objects.

Screenshot

The function runtime.newstack has a high CPU overhead when writing data to etcd. This function is an internal function of the Go language runtime. It is used to create a new stack space for a goroutine.

Screenshot

Below are more detailed explanations and suggestions regarding the current issue.

Screenshot

It mentions the function processUnaryRPC.

Screenshot

And it processes the Unary RPC.

Screenshot

Let’s go back to the code path. Hover the mouse over the green box for the first function.

Screenshot

We can see the source file of this function. And its full path for the server.go file in the tooltip.

Screenshot

The source line number is 1024.

Screenshot

Click the icon to copy the full Go source file path for this function.

Screenshot

Use the find command to find the source file.

Screenshot

Paste the file path we just copied.

Screenshot

Copy the full file path. Use the vim editor to open the source file. And look at the golang code in this file. You can use any editors you like.

Screenshot

Go to line 1024, as OpenResty XRay suggested.

Screenshot

The function md.Handler calls different message handlers depending on the type of gRPC message. The _KV_Range_Handler and _KV_Put_Handler we saw earlier are two examples of such md.Handler callbacks.

Screenshot

On the status bar you can see that this source line is inside the processUnaryRPC function.

Screenshot

The second hottest code path consumes about 12% of the CPU time.

Screenshot

From this function, we know it is writing to a network socket.

Screenshot

This is invoking the write system call.

Screenshot

This function sends response data to the network sockets via the HTTP/2 protocol. 

Screenshot

The third hottest code path consumes about 11% of the CPU time.

Screenshot

Here, runtime.mcall mainly executes the scheduling of goroutine.

Screenshot

Now we see the fourth hottest Go-code paths.

Screenshot

This is for logging every unary gRPC call. And we might want to skip such logging to save CPU time.

Screenshot

Automatic analysis and reports

OpenResty XRay can also monitor online processes automatically and show analysis reports. Go to the “Insights” page.

Screenshot

You can find the reports for daily and weekly periods.

Screenshot

So you don’t have to use the “Guided Analysis” feature. Guided analysis is useful for application development and demonstration purposes.

Screenshot

What is OpenResty XRay

OpenResty XRay is a dynamic-tracing product that automatically analyzes your running applications to troubleshoot performance problems, behavioral issues, and security vulnerabilities with actionable suggestions. Under the hood, OpenResty XRay is powered by our Y language targeting various runtimes like Stap+, eBPF+, GDB, and ODB, depending on the contexts.

If you like this tutorial, please subscribe to this blog site and/or our YouTube channel. Thank you!

About The Author

Yichun Zhang (Github handle: agentzh), is the original creator of the OpenResty® open-source project and the CEO of OpenResty Inc..

Yichun is one of the earliest advocates and leaders of “open-source technology”. He worked at many internationally renowned tech companies, such as Cloudflare, Yahoo!. He is a pioneer of “edge computing”, “dynamic tracing” and “machine coding”, with over 22 years of programming and 16 years of open source experience. Yichun is well-known in the open-source space as the project leader of OpenResty®, adopted by more than 40 million global website domains.

OpenResty Inc., the enterprise software start-up founded by Yichun in 2017, has customers from some of the biggest companies in the world. Its flagship product, OpenResty XRay, is a non-invasive profiling and troubleshooting tool that significantly enhances and utilizes dynamic tracing technology. And its OpenResty Edge product is a powerful distributed traffic management and private CDN software product.

As an avid open-source contributor, Yichun has contributed more than a million lines of code to numerous open-source projects, including Linux kernel, Nginx, LuaJIT, GDB, SystemTap, LLVM, Perl, etc. He has also authored more than 60 open-source software libraries.