之前学习了如何在Nginx中添加Lua模块 ,接下来就可以做实验完成自己的目标需求了,比如将上游服务的响应数据记录在日志中,方便后续的调试、审计等功能。
准备上游服务 首先准备一个上游服务的示例,在http块中添加如下内容:
在http块中添加log_format,作为记录上游服务响应日志的格式:
1 2 3 4 5 6 7 8 log_format bodylog escape=json '$remote_addr - [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" ' '<"$request_body"> "$resp_body"'; upstream geo { server mytest.com; }
当上游响应是json结构的数据时,Nginx默认会将引号打印成\x22
,后面再处理的时候比较麻烦,比如:
1 \x22status\x22: \x22success\x22
加上escape=json
后,会进行正确的转义:
在Nginx文档中有相关解析:
The escape parameter (1.11.8) allows setting json or default characters escaping in variables, by default, default escaping is used. The none value (1.13.10) disables escaping.
通过escape=json
也可以输出JSON形式的日志,比如:
1 2 3 4 5 6 7 8 9 10 log_format bodylog escape=json '{' '"remote_addr":"$remote_addr",' '"time_local":"$time_local",' '"request":"$request",' '"status": "$status",' '"body_bytes_sent":"$body_bytes_sent",' '"http_referrer":"$http_referer",' '"http_user_agent":"$http_user_agent"' '}';
可以查看日志输出:
1 { "remote_addr" : "192.168.1.2" , "time_local" : "07/Mar/2021:11:14:30 +0800" , "request" : "GET /geo/1.1.1.1 HTTP/1.1" , "status" : "200" , "body_bytes_sent" : "325" , "http_referrer" : "" , "http_user_agent" : "curl/7.55.1" "}
在server块中添加如下内容:
1 2 3 4 5 6 7 8 location ^~/geo/ { proxy_http_version 1.1; proxy_set_header Connection ""; proxy_set_header Host mytest.com; rewrite ^/geo/(.*)$ /json/$1 break; proxy_pass http://geo; access_log /usr/local/src/nginx/logs/geo.log bodylog; }
利用Lua记录上游响应结果 直接配置 在server块中添加如下内容:
1 2 3 4 5 6 7 8 9 10 lua_need_request_body on; set $resp_body ""; body_filter_by_lua ' local resp_body = ngx.arg[1] ngx.ctx.buffered = (ngx.ctx.buffered or "") .. resp_body if ngx.arg[2] then ngx.var.resp_body = ngx.ctx.buffered end ';
此处我记录的是整个响应结果,但如果响应过大可能会影响性能,可以做如下修改:
1 local resp_body = string .sub (ngx.arg [1 ], 1 , 1000 )
或者根据自己的需求上游响应数据进行过滤。
配置修改完成后reload,通过curl请求测试,查看/usr/local/src/nginx/logs/geo.log
文件,上游服务响应的结果已经记录:
1 192.168.1.2 - [06/Mar/2021:23:34:21 +0800] "GET /geo/1.1.1.1 HTTP/1.1" 200 325 "" "curl/7.55.1" <""> "{\"status\":\"success\",\"country\":\"Australia\",\"countryCode\":\"AU\",\"region\":\"QLD\",\"regionName\":\"Queensland\",\"city\":\"South Brisbane\",\"zip\":\"4101\",\"lat\":-27.4766,\"lon\":153.0166,\"timezone\":\"Australia/Brisbane\",\"isp\":\"Cloudflare, Inc\",\"org\":\"APNIC and Cloudflare DNS Resolver project\",\"as\":\"AS13335 Cloudflare, Inc.\",\"query\":\"1.1.1.1\"}"
此时nginx.conf主要配置如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 http { log_format bodylog escape=json '$remote_addr - [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" ' '<"$request_body"> "$resp_body"'; upstream geo { server mytest.com; } server { listen 80; server_name 192.168.1.1; lua_need_request_body on; set $resp_body ""; body_filter_by_lua ' local resp_body = ngx.arg[1] ngx.ctx.buffered = (ngx.ctx.buffered or "") .. resp_body if ngx.arg[2] then ngx.var.resp_body = ngx.ctx.buffered end '; location ^~/geo/ { proxy_http_version 1.1; proxy_set_header Connection ""; proxy_set_header Host mytest.com; rewrite ^/geo/(.*)$ /json/$1 break; proxy_pass http://geo; access_log /usr/local/src/nginx/logs/geo.log bodylog; } } }
脚本写入 这里可以将nginx.conf中相关的配置写入自定义路径的配置文件和lua脚本中,后期的维护也更加方便: 1、在conf目录下创建mywork文件夹,用于存在自己的文件; 2、将bodylog的日志格式写入到mywork的文件中,通过include
指令扩展; 3、将lua代码写入lua脚本; 4、修改nginx.conf文件:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 http { include mywork/log_upstream_resp.conf; upstream geo { server mytest.com; } server { listen 80; server_name 192.168.1.1; lua_need_request_body on; set $resp_body ""; body_filter_by_lua_file conf/mywork/log_upstream_resp.lua; location ^~/geo/ { proxy_http_version 1.1; proxy_set_header Connection ""; proxy_set_header Host mytest.com; rewrite ^/geo/(.*)$ /json/$1 break; proxy_pass http://geo; access_log /usr/local/src/nginx/logs/geo.log bodylog; } } }
重新reload后进行测试,效果与直接配置一致,但nginx.conf看起来更加简洁,便于维护,如果有多个server的话,也可以进行拆分,通过include
指令进行扩展。
error log 此前在业务上解决问题查找资料时,发现error_log
记录日志和上述access_log
的表现并不一样。在Github上找到了相同的issue,通过这个issue复现下,先来看一下配置文件:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 upstream mytest { server 192.168.0.113; } server { listen 8103; error_log /usr/local/src/nginx/logs/8103_error.log debug; location / { body_filter_by_lua_block { ngx.log(ngx.INFO,"ngx.arg[1]===========", ngx.arg[1]) ngx.log(ngx.INFO,"ngx.arg[2]===========", ngx.arg[2]) } proxy_pass http://mytest; } }
mytest
上游服务返回一串大的JSON数据,比如我这里使用https://raw.githubusercontent.com/grafana/grafana/main/package.json
,大小为12.8KB。当访问8103端口时,期望将数据响应记录在error日志中。
但是当实际访问8103端口时,body_filter_by_lua
多次调用,8103_error.log
显示ngx.arg[1]
只记录了一部分数据:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 $ cat 8103_error.log 2021/09/05 19:21:43 [info] 22342#0: *32 [lua] body_filter_by_lua:2: ngx.arg[1]==========={ "author": "Grafana Labs", "license": "AGPL-3.0-only", "private": true, "name": "grafana", "version": "8.2.0-pre", "repository": "github:grafana/grafana", "scripts": { "api-tests": "jest --notify --watch --config=devenv/e2e-api-tests/jest.js", "build": "node ./node_modules/webpack/bin/webpack.js --config scripts/webpack/webpack.prod.js", "dev": "webpack --progress --color --config scripts/webpack/webpack.dev.js", "e2e": "./e2e/start-and-run-suite", "e2e:debug": "./e2e/start-and-run-suite debug", "e2e:dev": "./e2e/start-and-run-suite dev", "test": "jest --notify --watch", "test:accessibility": "pa11y-ci --config .pa11yci.conf.js", "lint": "yarn run lint:ts && yarn run lint:sass", "lint:ts": "eslint . --ext .js,.tsx,.ts --cache", "lint:sass": "yarn run sass-lint -c public/sass/.sass-lint.yml 'public/sass/**/*.scss, packages/**/*.scss' -v -i '**/node_modules/**/*.scss'", "test:ci": "mkdir -p reports/junit && export JEST_JUNIT_OUTPUT_DIR=reports/junit && jest --ci --reporters=default --reporters=jest-junit -w ${TEST_MAX_WORKERS:-100%}", "lint:fix": "yarn lint --fix", "packages:build": "lerna run clean && lerna run build --ignore @grafana-plugins/input-datasource", "packages:docsExtract": "rm -rf ./reports/docs && lerna run docsExtract", "packages:docsToMarkdown": "api-documenter markdown --input-folder ./reports/docs/ --output-folder ./docs/sources/packages_api/ --hugo", "packages:prepare": "lerna version --no-push --no-git-tag-version --force-publish --exact", "packages:publish": "lerna publish from-package --contents dist", "packages:publishCanary": "lerna publish from-package --contents dist --dist-tag canary --yes", "packages:publishLatest": "lerna publish from-package --contents dist --yes", "packages:publishNext": "lerna publish from-package --contents dist --dist-tag next --yes", "packages:publishDev": "lerna publish fro 2021/09/05 19:21:43 [info] 22342#0: *32 [lua] body_filter_by_lua:3: ngx.arg[2]===========false while sending to client, client: 192.168.0.118, server: , request: "GET /package.json HTTP/1.1", upstream: "http://192.168.0.113:80/package.json", host: "192.168.0.118:8103" 2021/09/05 19:21:43 [info] 22342#0: *32 [lua] body_filter_by_lua:2: ngx.arg[1]=========== while sending to client, client: 192.168.0.118, server: , request: "GET /package.json HTTP/1.1", upstream: "http://192.168.0.113:80/package.json", host: "192.168.0.118:8103" 2021/09/05 19:21:43 [info] 22342#0: *32 [lua] body_filter_by_lua:3: ngx.arg[2]===========true while sending to client, client: 192.168.0.118, server: , request: "GET /package.json HTTP/1.1", upstream: "http://192.168.0.113:80/package.json", host: "192.168.0.118:8103" 2021/09/05 19:21:43 [info] 22342#0: *32 client 192.168.0.118 closed keepalive connection
造成这个问题的原因是Nginx的错误日志消息有硬编码长度限制,在标准Nginx核心中为2KB,在OpenResty的Nginx核心中为4KB。
Nginx src/core/ngx_log.h
中宏定义:
1 #define NGX_MAX_ERROR_STR 2048
OpenResty patches/nginx-1.19.9-larger_max_error_str.patch
中:
1 2 3 4 -#define NGX_MAX_ERROR_STR 2048 +#ifndef NGX_MAX_ERROR_STR +#define NGX_MAX_ERROR_STR 4096 +#endif
现在就知道问题所在了。
全缓冲 上面通过body_filter_by_lua
可以获取全部响应正文,但agentzh
春哥是不建议这么做的:
All the point of an output filter in nginx is to do the data processing in a non-buffered fashion, which is crucial for processing HUGE response body data streams. If you do full buffering yourself in body_filter_by_lua*, then you may blow up the LuaJIT’s GC-managed memory upon HUGE responses, and defeat all the purposes of nginx’s output filter design.
如果业务中必须获取到所有请求的响应数据之后进行处理(比如完整记录或者数据脱敏),那必须缓冲所有的数据块。春哥在OpenResty的Discussion中也给出了答案,我们可以根据业务情况自行修改:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 local chunk, eof = ngx.arg [1 ], ngx.arg [2 ]local buffered = ngx.ctx.bufferedif not buffered then buffered = {} ngx.ctx.buffered = buffered end if chunk ~= "" then buffered[#buffered + 1 ] = chunk ngx.arg [1 ] = nil end if eof then local whole = table .concat (buffered) ngx.ctx.buffered = nil ngx.arg [1 ] = whole ngx.var.resp_body = whole end
这种处理方式,比上面的更严谨,对变量进行了初始化,判断流是否已经结束等。另外,如果通过body_filter_by_lua*
修改响应体,并且修改了响应体的实际大小,需要注意置空Content-Length报头,header_filter_by_lua*
会在Nginx发送报头之前调用,不会误导客户端:
1 2 3 header_filter_by_lua ' ngx.header.content_length = nil ';
或者
1 2 ngx.header.content_length = nil
参考 1、https://serverfault.com/questions/361556/is-it-possible-to-log-the-response-data-in-nginx-access-log 2、http://nginx.org/en/docs/http/ngx_http_log_module.html#log_format 3、http://nginx.org/en/docs/ngx_core_module.html#include 4、https://stackoverflow.com/questions/25049667/how-to-generate-a-json-log-from-nginx 5、https://github.com/openresty/lua-nginx-module/issues/1092 6、https://forum.openresty.us/d/4462-7200de849c40c83a32394ae9cb27ceef/2 7、https://zhuanlan.zhihu.com/p/67904411 8、https://github.com/apache/apisix/blob/master/apisix/plugins/grpc-transcode/response.lua 9、https://jkzhao.github.io/2018/05/03/Lua-OpenResty%E4%BF%AE%E6%94%B9response-body/