In this tutorial, you will learn how to diagnose root causes for HTTP 504 gateway timeout errors in online OpenResty or Nginx servers. By capturing only the packets in the TCP connection leading to the 504 errors on the application level, OpenResty XRay minimizes the performance impact. This feature makes it ideal for production environments that are sensitive to performance overhead and latency. This is our powerful Smart Packet Capture technology.
First, let’s check the application’s access log.
As you can see, there are 504 errors.
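To make this check repeatable outside the terminal, here is a minimal Python sketch that counts 504 responses per request line. It assumes the server writes Nginx's predefined "combined" log format. The function name `count_504_by_request` and the sample log lines are made up for illustration and are not taken from the application shown here.

```python
import re
from collections import Counter

# Assumption: Nginx's predefined "combined" log format. Adjust the pattern
# if the server uses a custom log_format directive.
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" (?P<status>\d{3}) '
)

def count_504_by_request(lines):
    """Return a Counter mapping each request line to its number of 504 responses."""
    hits = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("status") == "504":
            hits[m.group("request")] += 1
    return hits

# Hypothetical sample lines, for illustration only.
sample = [
    '127.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /api HTTP/1.1" 504 167 "-" "curl/8.0"',
    '127.0.0.1 - - [01/Jan/2024:00:00:02 +0000] "GET /ok HTTP/1.1" 200 12 "-" "curl/8.0"',
    '127.0.0.1 - - [01/Jan/2024:00:00:05 +0000] "GET /api HTTP/1.1" 504 167 "-" "curl/8.0"',
]

for request, count in count_504_by_request(sample).items():
    print(count, request)
```

On a real server, you would pass an open file handle for the access log instead of the `sample` list.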
Use the ps command to see the full command line for this application.
We can see it is the nginx binary executable provided by the official OpenResty repository.
Let’s use OpenResty XRay to analyze these 504 response errors in real time and figure out the root cause.
Open the OpenResty XRay web console in the web browser.
Make sure you are watching the right machine.
You can choose the right machine from the list below if the current one is not correct.
Go to the “Guided Analysis” page.
Here you can see different types of problems that you can diagnose.
Let’s select “Errors & exceptions”.
Click on “Next”.
Select the OpenResty application you saw earlier.
We do not specify any process here.
Select “Whole application”.
Make sure that the application type is right. Usually the default should be correct.
OpenResty XRay can analyze multiple language levels at the same time. We’ll keep both Lua and C/C++ selected.
We can also set the maximum analysis time. We’ll leave it at 300 seconds, which is the default value.
Let’s start analyzing.
The system will keep performing different rounds of analysis. Now it’s executing the first round.
The first round is done and it’s on to the second one already. That’s enough for this case.
Let’s stop analyzing.
It shows that the system is generating a report.
We can see it automatically created an analysis report.
This is the type of problem we diagnosed: “Errors & Exceptions”.
The first issue is about the HTTP response status code 504.
The entire HTTP request took over 3 seconds.
The network packet with the largest delay relative to the previous one is this TCP packet with the PSH and ACK flags.
The previous network packet is the ACK packet.
The delay between the two packets is more than 3 seconds.
The network packet with the largest delay is sent by the upstream server to the current server. The endpoint address of the upstream server is shown here.
You can also find the endpoint address of the current server.
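The idea of "the packet with the largest delay relative to the previous one" can be sketched in a few lines of Python. This is only an illustration of the concept, not how OpenResty XRay implements it. The `Packet` type, the `largest_gap` function, and the sample timestamps are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    timestamp: float  # seconds since the start of the capture
    direction: str    # "in" = received by the current server, "out" = sent by it
    flags: str        # TCP flags, e.g. "ACK" or "PSH,ACK"

def largest_gap(packets):
    """Return (serial_number, delay, packet) for the packet that arrived
    longest after its predecessor. Serial numbers start from 1.
    Returns None when there are fewer than two packets."""
    best = None
    for i in range(1, len(packets)):
        delay = packets[i].timestamp - packets[i - 1].timestamp
        if best is None or delay > best[1]:
            best = (i + 1, delay, packets[i])
    return best

# Hypothetical packets on one upstream connection, for illustration only.
capture = [
    Packet(0.000, "out", "SYN"),
    Packet(0.001, "in", "SYN,ACK"),
    Packet(0.002, "out", "ACK"),
    Packet(0.003, "out", "PSH,ACK"),  # request sent to the upstream server
    Packet(0.004, "in", "ACK"),       # upstream acknowledges the request
    Packet(3.104, "in", "PSH,ACK"),   # upstream finally sends data
]

serial, delay, packet = largest_gap(capture)
print(f"packet #{serial} ({packet.flags}, {packet.direction}) waited {delay:.3f}s")
```

In this made-up capture, the slowest packet is a PSH/ACK packet received by the current server, right after an ACK packet, which mirrors the pattern described in the report.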
Click “More” to see more details.
Here is the HTTP request URI.
The horizontal axis is the serial number for each packet, starting from 1 and increasing monotonically.
The vertical axis shows delays of the packets relative to their previous ones. The larger the value, the larger the delay.
The small squares represent the network packets sent out by the current server.
The small circles here represent network packets received by the current server.
Hover the mouse over this slowest packet. You can see the latency data and that the network packet was received by the current server.
Based on the analysis above, the system concluded that the current server triggered a timeout error while waiting for the network packet from the upstream server.
It gives the possible root causes for this error: a slow upstream server, a slow network link between the current server and the upstream server, or a timeout setting on the current server that is too short for that upstream server.
It also gives detailed suggestions.
Here is another type of HTTP 504 error. The current server actively closed the current connection after the upstream server did not respond for a long time.
Here you can clearly see that the TCP packet with the FIN and ACK flags has the largest delay with respect to the previous packet. That means the current server actively closed the connection.
So the conclusion is that the current server activated the timeout protection while waiting for the network packet from the upstream server, and thus gave up waiting and closed the current connection.
The system also gives the possible root causes for this error: a slow upstream server, a slow network link between the upstream server and your current server, or a timeout setting on the current server that is too short for that upstream server.
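The two 504 patterns in this tutorial differ only in which side sent the slowest packet and which TCP flags it carried. The following Python sketch captures that distinction as a simplified heuristic. It is an assumption-laden illustration, not the actual logic of OpenResty XRay. The function name, the direction labels, and the returned strings are hypothetical.

```python
# Candidate root causes listed in the report for both patterns.
CANDIDATE_ROOT_CAUSES = (
    "slow upstream server",
    "slow network link between the current server and the upstream server",
    "timeout setting on the current server too short for that upstream server",
)

def classify_slowest_packet(direction, flags):
    """Classify the slowest packet of a 504 request.

    direction: "in" if the current server received the packet, "out" if it sent it.
    flags: comma-separated TCP flags, e.g. "PSH,ACK" or "FIN,ACK".
    """
    flag_set = {f.strip().upper() for f in flags.split(",")}
    if direction == "in" and "PSH" in flag_set:
        # The upstream server sent data, but only after a long wait.
        return "timed out while waiting for upstream data"
    if direction == "out" and "FIN" in flag_set:
        # The current server gave up waiting and closed the connection itself.
        return "current server closed the connection after waiting"
    return "unclassified"

print(classify_slowest_packet("in", "PSH,ACK"))
print(classify_slowest_packet("out", "FIN,ACK"))
```

Both branches point to the same list of candidate root causes, which is why the report lists the same three possibilities in each case.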
OpenResty XRay can also monitor online processes automatically and show analysis reports. Go to the “Insights” page.
You can find the automatic reports on the “Insights” page for daily and weekly periods.
For this reason, you don’t have to use the “Guided Analysis” feature.
Guided analysis is useful for application development and demonstration purposes.
OpenResty XRay is a dynamic-tracing product that automatically analyzes your running applications to troubleshoot performance problems, behavioral issues, and security vulnerabilities with actionable suggestions. Under the hood, OpenResty XRay is powered by our Y language, which targets various runtimes like Stap+, eBPF+, GDB, and ODB, depending on the context.
If you like this tutorial, please subscribe to this blog site and/or our YouTube channel. Thank you!
Yichun is one of the earliest advocates and leaders of “open-source technology”. He has worked at many internationally renowned tech companies, such as Cloudflare and Yahoo!. He is a pioneer of “edge computing”, “dynamic tracing” and “machine coding”, with over 22 years of programming and 16 years of open source experience. Yichun is well-known in the open-source space as the project leader of OpenResty®, adopted by more than 40 million global website domains.
OpenResty Inc., the enterprise software start-up founded by Yichun in 2017, has customers from some of the biggest companies in the world. Its flagship product, OpenResty XRay, is a non-invasive profiling and troubleshooting tool that significantly enhances and utilizes dynamic tracing technology. And its OpenResty Edge product is a powerful distributed traffic management and private CDN software product.
As an avid open-source contributor, Yichun has contributed more than a million lines of code to numerous open-source projects, including the Linux kernel, Nginx, LuaJIT, GDB, SystemTap, LLVM, and Perl. He has also authored more than 60 open-source software libraries.