Nginx_proxy_next_upstream引发的服务不可用

起因

早上突然收到大量业务日志告警 都提示nacos不可用,检查完nacos 发现服务是正常的,通过nginx的日志和nacos的日志 确定是因为错误请求和nginx配置有关

错误请求

  • nacos可以通过公网被访问到
  • 42.236.10.106 为360爬虫地址
  • 爬虫发起请求 GET /nacos/v1/cs/configs 因为缺少参数,导致nacos出现报错 返回500状态码
  • 而nginx的配置proxy_next_upstream 的配置 又将错误请求转发到下一个server上,导致所有server都返回过500,ng则认为server都挂了,触发no live upstreams while connecting to upstream (nginx错误日志中)

nginx的错误重试配置:

1
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_404;

nginx的请求日志如下

1
2
3
4
42.236.10.106  - [13/Aug/2021:19:08:08 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" 500 0.006 0.001, 0.002, 0.002 nacos.a.com 58 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" - "-" "ip1:8848, ip2:8848, ip3:8848"  
42.236.10.106 - [13/Aug/2021:19:08:10 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" 500 0.006 0.002, 0.001, 0.002 nacos.a.com 58 "http://baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" - "-" "ip1:8848, ip2:8848, ip3:8848"
42.236.10.75 - [13/Aug/2021:19:08:26 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" 500 0.009 0.002, 0.001, 0.006 nacos.a.com 58 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" - "-" "ip1:8848, ip2:8848, ip3:8848"
42.236.10.75 - [13/Aug/2021:19:08:26 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" 502 0.001 0.001, 0.000 nacos.a.com 552 "http://baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" - "-" "ip3:8848, nacos-portal"

nacos日志

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
org.springframework.web.bind.MissingServletRequestParameterException: Required String parameter 'dataId' is not present
at org.springframework.web.method.annotation.RequestParamMethodArgumentResolver.handleMissingValue(RequestParamMethodArgumentResolver.java:202)
at org.springframework.web.method.annotation.AbstractNamedValueMethodArgumentResolver.resolveArgument(AbstractNamedValueMethodArgumentResolver.java:113)
at org.springframework.web.method.support.HandlerMethodArgumentResolverComposite.resolveArgument(HandlerMethodArgumentResolverComposite.java:126)
at org.springframework.web.method.support.InvocableHandlerMethod.getMethodArgumentValues(InvocableHandlerMethod.java:166)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:134)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:102)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:800)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1038)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:942)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1005)
at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:897)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:634)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:882)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.web.filter.CorsFilter.doFilterInternal(CorsFilter.java:96)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.springframework.boot.actuate.web.trace.servlet.HttpTraceFilter.doFilterInternal(HttpTraceFilter.java:90)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at com.alibaba.nacos.core.auth.AuthFilter.doFilter(AuthFilter.java:60)

proxy_next_upstream配置 (下列内容转自:https://www.cnblogs.com/cyleon/p/11023229.html)

nginx作为反向代理服务器,后端RS有多台服务器,上层通过一定机制保证容错和负载均衡。

nginx的重试机制就是容错的一种

官方链接:http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_next_upstream

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
proxy_next_upstream error | timeout | invalid_header | http_500 | http_502 | http_503 | http_504 | http_403 | http_404 | http_429 | non_idempotent | off ...;
Default: proxy_next_upstream error timeout;
Context: http, server, location

指定应将请求传递到下一个服务器的情况:
error # 与服务器建立连接,向其传递请求或读取响应头时发生错误;
timeout # 在与服务器建立连接,向其传递请求或读取响应头时发生超时;
invalid_header # 服务器返回空的或无效的响应;
http_500 # 服务器返回代码为500的响应;
http_502 # 服务器返回代码为502的响应;
http_503 # 服务器返回代码为503的响应;
http_504 # 服务器返回代码504的响应;
http_403 # 服务器返回代码为403的响应;
http_404 # 服务器返回代码为404的响应;
http_429 # 服务器返回代码为429的响应(1.11.13);
non_idempotent # 通常,请求与 非幂等 方法(POST,LOCK,PATCH)不传递到请求是否已被发送到上游服务器(1.9.13)的下一个服务器; 启用此选项显式允许重试此类请求;
off # 禁用将请求传递给下一个服务器。

下面还有一个参数影响重试次数,0表示不限制。:

1
2
3
Syntax:     proxy_next_upstream_tries number;
Default: proxy_next_upstream_tries 0;
Context: http, server, location

例子:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
upstream app-proxy {
server 192.168.5.100:8080;
server 192.168.5.101:8080;
check interval=2000 rise=1 fall=3 timeout=3000 type=http;
check_keepalive_requests 1;
# check_http_send "HEAD /status/status.html HTTP/1.1\r\n\r\n";
check_http_send "GET /status/status.html HTTP/1.1\r\nConnection: close\r\nHost: localhost\r\n\r\n";
check_http_expect_alive http_2xx http_3xx;
}

location / {
proxy_pass http://app-proxy;
proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
proxy_next_upstream_tries 3;
proxy_connect_timeout 60s;
proxy_read_timeout 60s;
proxy_send_timeout 60s;
proxy_pass_request_headers on;
proxy_set_header Host $host:$server_port;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
set $domain default;

Nginx_proxy_next_upstream引发的服务不可用
https://imwang77.github.io/2021/08/16/Nginx_proxy_next_upstream引发的服务不可用/
作者
imwang77
发布于
2021年8月16日
更新于
2021年8月16日
许可协议