Real-Time JS/CSS/HTML Minification at 120 MB/s — Right in Your Nginx/OpenResty Gateway
If you’re running an OpenResty gateway in front of systems you don’t control (legacy apps, third-party tenants, upstream services with no build pipeline you can touch), openresty-minifiers is built for exactly that gap. It’s a streaming JS/CSS/HTML minifier that runs as an Nginx output filter, hitting 120+ MB/s on a single core with constant memory overhead.
Within your CI/CD pipeline, minification has likely become a mundane part of the process.
Whether it’s TerserPlugin in a Webpack configuration, Vite’s default esbuild minification, or cssnano running silently in a PostCSS toolchain—these tools are so mature you barely notice they exist. You configure them once, and they run thousands of times without a hitch. The engineering world solved this problem a decade ago.
But all of that maturity rests on one assumption: that you control the build pipeline.
When You Can’t Touch the Build Pipeline
Imagine three scenarios:
Scenario One: Reverse-Proxying a Legacy System. You inherit a decade-old Java monolith. Its front-end assets are served completely uncompressed, but you have no permission to modify its build process—you don’t even have the source code. Your OpenResty gateway is the only place you can step in.
Scenario Two: A Multi-Tenant SaaS Gateway. Your platform proxies traffic for hundreds of tenants, each responsible for maintaining their own application. You need to optimize egress bandwidth centrally at the gateway layer, but you can’t require every tenant to overhaul their build toolchain.
Scenario Three: Transparent Optimization at the Edge. You’re running OpenResty on your CDN PoP (Point of Presence) nodes. You want to perform real-time compression on all static assets that flow through, completely independent of any upstream configuration changes.
All three scenarios share one common characteristic: minification must happen at runtime, processing the response body as a stream within the Nginx filter layer.
This constraint forces us to revisit a supposedly solved problem from the ground up.
Why a Regex Won’t Cut It
Most engineers have the same gut reaction when they first encounter this requirement: “Isn’t this just string manipulation? Can’t we just use regex to replace comments and extra whitespace?”
This intuition is not only wrong—it’s a classic pitfall.
The Slash Ambiguity Problem and Why It Matters
For example, consider the following snippet of JavaScript code:
var result = a / b / c;
var regex = /pattern/g;
var str = "remove // this comment? no";
// this is a real comment
You need to remove the real comment (the fourth line), but you must not alter the "// this comment? no" inside the string literal, touch the slashes in the regex literal /pattern/g, or misinterpret the division a / b / c as the start of a comment.
This is the well-known slash ambiguity problem in JavaScript lexical analysis: the / character can be a division operator, the start of a regular expression literal, or the start of a comment, depending on the context. It’s impossible to differentiate based on local features alone; maintaining the complete parser state is essential.
CSS faces similar challenges: the content inside a url() function cannot be treated as plain text, and whitespace within calc() is semantically significant (e.g., the spaces in calc(100% - 20px) cannot be removed). HTML is even more complex—<script> and <style> tags require switching to entirely different parsing modes, which are precisely the parsing modes for JavaScript and CSS, respectively.
Any minifier that fails to account for these syntactic contexts will inevitably introduce hard-to-diagnose bugs in production.
Your Data Arrives in Chunks. Your Parser Can’t Wait.
A build-time minifier operates on complete files. It can parse a full Abstract Syntax Tree (AST) and then perform transformations based on it. The entire process is straightforward and deterministic.
In the Nginx filter layer, however, you work with fragments of a data stream. An HTTP response body is split into a buffer chain and passed to the filter module. The size of each buffer is determined by the upstream and the kernel, making it unpredictable from the filter’s perspective.
In practice, that means a comment might span two buffers: one buffer might end with the opening /*, with the closing */ arriving in the next. A string literal could be left in an "unclosed" state at a buffer boundary, and a </script> tag could likewise be split across two chunks.
A naive implementation will fail silently at these boundaries—it won’t report errors but will produce a JS file that occasionally fails to parse in the browser.
Buffering the Whole Response Is Not an Option
The most straightforward approach is to buffer the entire response body into memory and then process it with a mature, offline minifier.
The problem is that this violates Nginx’s streaming model and introduces significant engineering risks:
- Uncontrolled memory growth: A 2 MB JS file requires at least 2 MB of additional heap memory per in-flight response. Multiplied by the number of concurrent connections, this can amount to tens of gigabytes (10,000 concurrent responses at 2 MB each is already 20 GB).
- Increased TTFB: The entire response body must be received before processing can begin, which directly increases the client’s Time To First Byte (TTFB).
- A single large file can crash a worker: An unbounded buffer is, in itself, a Denial-of-Service (DoS) vector.
A viable solution must be able to perform incremental processing within a fixed-size buffer (e.g., 8KB) and guarantee the correctness of the result—even when syntactic structures span across buffer boundaries.
Therefore, what you actually need is not an off-the-shelf minifier, but a streaming lexical analyzer built on a finite-state machine, one capable of correctly carrying its parsing state across buffers in O(1) memory. That is a completely different engineering challenge.
Three Approaches That Almost Work
Without a dedicated tool, engineering teams often attempt the following approaches:
Approach 1: body_filter_by_lua + Lua Regex. This involves using OpenResty’s Lua API to receive the entire response body and then performing string manipulation. This solution works for small files, but it requires buffering the complete response. Once the body exceeds tens of kilobytes, memory pressure and GC pauses become significant. The more fundamental issue is that Lua’s string operations are not designed for syntax-aware stream processing.
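A typical implementation of this approach looks something like the following sketch (shown only to illustrate the pattern being critiqued; a real version would also need to clear the Content-Length header in a header filter):

location ~ \.js$ {
    body_filter_by_lua_block {
        -- ngx.arg[1] is the current body chunk; ngx.arg[2] is the EOF flag
        local chunk, eof = ngx.arg[1] or "", ngx.arg[2]
        -- unbounded buffering: memory grows with the response size
        ngx.ctx.buf = (ngx.ctx.buf or "") .. chunk
        if not eof then
            ngx.arg[1] = nil  -- emit nothing until the last chunk arrives
            return
        end
        -- syntax-blind rewriting: mangles string and regex literals
        local body = ngx.ctx.buf
        body = body:gsub("/%*.-%*/", "")  -- strip block comments
        body = body:gsub("%s+", " ")      -- collapse all whitespace
        ngx.arg[1] = body
    }
}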
Approach 2: The sub_filter directive. Nginx’s native ngx_http_sub_module offers string replacement, but it performs literal matching and is not syntax-aware. Using it to remove comments could accidentally remove identical strings within the code, rendering it impractical.
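For example, a sub_filter based attempt might look like this; every replacement is a context-free literal match, so it applies inside string literals just as readily as outside them:

location ~ \.js$ {
    sub_filter_types application/javascript;
    sub_filter_once  off;
    # literal, context-free matching: this also rewrites the contents
    # of string literals, template literals, and regex literals
    sub_filter '  ' ' ';
}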
Approach 3: Upstream Application-Layer Processing. This means adding middleware in the backend service to handle minification. It pushes the responsibility for minification back to the application layer, which contradicts the core constraint that the upstream cannot be modified. It also introduces additional service dependencies and latency.
These are not poor engineering decisions; they are reasonable attempts given the constraints. However, none of them truly solve the trilemma of “streaming, syntax-aware processing, and O(1) memory usage.”
If you’re facing any of these scenarios or have been burned by the body_filter_by_lua approach, read on to see how we tackle this problem.
A Streaming FSM for Each Language, Compiled to Native Code
Understanding the structure of the problem is key to devising the right solution.
These three constraints—syntactic correctness, stream processing, and fixed memory—are not independent dimensions that can be optimized in isolation. A tangible tension exists between them:
- Achieving higher syntactic correctness typically requires more contextual state, which conflicts with the O(1) memory requirement.
- Stream processing demands incremental output, but certain syntax transformations (like rewriting relative paths) require lookahead, which is at odds with a pure streaming approach.
- Maintaining a complete parsing state for each language requires a dedicated engineering effort.
A solution that genuinely satisfies all three constraints requires designing a dedicated streaming finite-state machine for each of JS, CSS, and HTML, and compiling it into native code (as an Nginx filter module). This ensures that processing speed does not become a bottleneck in the request processing pipeline.
This is not a solution that can be built by composing existing tools; it must be designed from the ground up.
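To make the shape of such a state machine concrete, here is a minimal sketch in Lua (illustrative only; the library’s actual implementation is native code). It is a streaming FSM that strips /* ... */ block comments while carrying a few bytes of state across arbitrary chunk boundaries. A real minifier must also track strings, regex literals, line comments, and more, but the structure of the problem is the same.

-- A sketch of a streaming FSM that strips /* ... */ block comments.
-- All cross-chunk state is one small enum, so memory stays O(1)
-- regardless of how the input is split. Strings, regex literals, and
-- // line comments are deliberately out of scope here; a real
-- implementation must track those too (and flush pending state at EOF).
local function new_comment_stripper()
    local state = "code"  -- "code" | "slash" | "comment" | "star"
    return function (chunk)
        local out = {}
        for i = 1, #chunk do
            local c = chunk:sub(i, i)
            if state == "code" then
                if c == "/" then state = "slash" else out[#out + 1] = c end
            elseif state == "slash" then
                if c == "*" then
                    state = "comment"       -- entering /* ... */
                else
                    out[#out + 1] = "/"     -- it was a lone slash after all
                    out[#out + 1] = c
                    state = "code"
                end
            elseif state == "comment" then
                if c == "*" then state = "star" end
            else  -- "star": a '*' that may close the comment
                if c == "/" then
                    state = "code"
                elseif c ~= "*" then
                    state = "comment"
                end
            end
        end
        return table.concat(out)
    end
end

-- The chunk boundary falls inside the comment, yet the output is intact:
local feed = new_comment_stripper()
print(feed("var a = 1; /* spans a") .. feed(" boundary */var b = 2;"))
-- prints: var a = 1; var b = 2;

The split between the two feed() calls lands in the middle of the comment, and the output is still correct. That is precisely the property the Nginx filter layer demands of every syntactic construct, in every language.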
What 120 MB/s Actually Means in Production
openresty-minifiers is a proprietary library from OpenResty Inc., designed specifically for this use case. It contains three independent minifier modules for JS, CSS, and HTML, all implemented as Nginx output filters.
Core performance metrics:
- JS minifier single-core throughput: 120+ MB/s (on a Core i9-13900K)
- Time complexity: O(n), where n is the response body length
- Space complexity: O(1), using a fixed 8KB buffer by default
Public benchmark data is available here: https://openresty.org/misc/re/bench/
In practical engineering terms, these numbers mean:
A processing speed of 120 MB/s translates to roughly 960 Mbps of minification throughput per core (120 MB/s × 8 bits per byte). In a typical gigabit egress scenario, a single core’s processing power is already on par with the egress bandwidth, so minification won’t become a bottleneck in the request path. For high-bandwidth scenarios of 10 Gbps or more, processing capacity scales linearly across cores.
O(1) memory means the module is safe for response bodies of any size—you don’t need to whitelist large files or worry about a single massive file crashing a worker. Memory usage is deterministic, which makes capacity planning straightforward.
Under the hood, the library uses or-regex, an in-house regex compiler from OpenResty Inc. based on DFA optimization algorithms. This is the key to achieving 120+ MB/s throughput while maintaining O(1) memory. If you’d like to verify these numbers in your own environment or assess if it’s a good fit for your traffic volume, feel free to contact our technical team.
Five-Minute Setup: A Configuration Example
This library is distributed as a private package and requires a valid subscription token. It depends on the replace-filter-plus module, and both can be installed together via your package manager.
# apt (Ubuntu/Debian)
sudo apt-get install -y openresty-minifiers replace-filter-plus-nginx-module-1.21.4
# yum (RHEL/CentOS)
sudo yum install -y openresty-minifiers replace-filter-plus-nginx-module-1.21.4
The configuration has two levels: global preloading (in the http block) and per-location activation (for specific routes).
Here is an example using a JS minifier:
# Load the filter module (load_module is only valid in the main
# context, i.e., outside the http block)
load_module /usr/local/openresty/nginx/modules/ngx_http_replace_filter_module.so;

http {
    # Pre-compile the minification program during the init phase,
    # so it doesn't impact request latency
    replace_filter_preload /usr/local/openresty-minifiers/lib/min-js.so
                           /usr/local/openresty-minifiers/tpls/min-js.tpl;

    init_by_lua_block { require "resty.replace" }

    server {
        location ~ \.js$ {
            replace_filter_types application/javascript;
            replace_filter_max_buffered_size 8k;

            access_by_lua_block {
                local ok, err = require "resty.replace".pick("min-js")
                if not ok then
                    error("failed to pick replace prog: " .. err)
                end
            }
        }
    }
}
The CSS and HTML minifiers share an identical configuration structure. Simply replace the .so and .tpl paths and the pick() parameter with their min-css / min-html counterparts. All three minifiers can be loaded independently and used in any combination.
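For instance, applying that substitution rule, a CSS setup might look like this (the text/css MIME type here is an assumption; check what your upstream actually sends):

# In the http block: preload the CSS program
replace_filter_preload /usr/local/openresty-minifiers/lib/min-css.so
                       /usr/local/openresty-minifiers/tpls/min-css.tpl;

# In a server block:
location ~ \.css$ {
    replace_filter_types text/css;
    replace_filter_max_buffered_size 8k;

    access_by_lua_block {
        local ok, err = require "resty.replace".pick("min-css")
        if not ok then
            error("failed to pick replace prog: " .. err)
        end
    }
}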
Four Things to Know Before Going Live
MIME types must match the upstream Content-Type. By default, replace_filter_types only processes text/html. If your upstream service returns text/javascript instead of the standard application/javascript, you must explicitly specify it in your configuration:
replace_filter_types application/javascript text/javascript;
We recommend using curl -I to check the actual Content-Type header returned by your upstream service before configuring the corresponding MIME types.
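For example (upstream.example.com is a placeholder for your actual upstream host):

curl -sI https://upstream.example.com/app.js | grep -i '^content-type'
# e.g. "content-type: text/javascript" -> list text/javascript in replace_filter_types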
replace_filter_max_buffered_size rarely needs adjustment. The default value of 8 KB is sufficient for most scenarios. This parameter controls the maximum buffer size the filter module uses to handle syntax structures that span across buffers; it does not limit the size of the response body. Even a 10 MB JavaScript file will only consume a constant 8 KB of memory.
The Last-Modified header is removed by default. Since minification alters the response body, replace_filter_last_modified is set to clear by default. If your CDN caching strategy relies on the Last-Modified header for conditional requests, you should evaluate how this behavior might affect your cache hit rate. To preserve the header:
replace_filter_last_modified keep;
We recommend a gradual rollout, starting with a low-risk location. Even with a production-proven tool, it’s always a good practice to run any new filter module on a non-critical path for a few days. This allows you to verify that there are no compatibility issues with your upstream content before a full-scale rollout.
Conclusion
In summary, handling tasks like code minification at runtime inside Nginx can yield significant benefits, but it also places extreme demands on low-level optimization capabilities.
The efficient minification solution discussed in this post is now packaged as the openresty-minifiers module and included in OpenResty XRay’s suite of proprietary libraries. This suite also features high-performance client libraries for Redis, HTTP, and Kafka implemented in C, an optimized LuaJIT engine, and a lock-free module for dynamic metrics—all designed to solve a common challenge: achieving a level of performance within the OpenResty runtime that is beyond the reach of typical open-source solutions. For a complete list, please see our Proprietary Libraries.
If you are facing similar challenges, please contact our engineering team via the “Contact Us” button in the bottom-right corner to inquire about deployment plans and subscription details.
About The Author
Yichun Zhang (GitHub handle: agentzh) is the original creator of the OpenResty® open-source project and the CEO of OpenResty Inc.
Yichun is one of the earliest advocates and leaders of open-source technology. He has worked at internationally renowned tech companies such as Cloudflare and Yahoo!. A pioneer of edge computing, dynamic tracing, and machine coding, he has over 22 years of programming experience and 16 years of open-source experience. Yichun is well known in the open-source community as the project leader of OpenResty®, which is used by more than 40 million website domains worldwide.
OpenResty Inc., the enterprise software start-up Yichun founded in 2017, counts some of the biggest companies in the world among its customers. Its flagship product, OpenResty XRay, is a non-invasive profiling and troubleshooting tool that significantly enhances and applies dynamic-tracing technology. Its OpenResty Edge product is a powerful distributed traffic-management and private CDN software platform.
As an avid open-source contributor, Yichun has contributed more than a million lines of code to numerous open-source projects, including the Linux kernel, Nginx, LuaJIT, GDB, SystemTap, LLVM, and Perl. He has also authored more than 60 open-source software libraries.