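Cause
One morning we suddenly received a flood of business-log alerts, all reporting that Nacos was unavailable. Nacos itself turned out to be running normally. The nginx and Nacos logs showed that the problem came from a combination of malformed requests and the nginx configuration.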
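The bad requests
- Nacos was reachable from the public internet.
- 42.236.10.106 is an address used by the 360 crawler.
- The crawler requested GET /nacos/v1/cs/configs without the required parameters, so Nacos raised an exception and returned a 500 status code.
- The proxy_next_upstream setting in nginx then passed the failing request on to the next server, so every server ended up returning a 500. Because http_500 is listed in the directive, each of those responses also counts as a failed attempt against the server (nginx's default max_fails is 1, so unless that was overridden, one failure is enough). nginx therefore treated all servers as down and logged "no live upstreams while connecting to upstream" in its error log.
The nginx retry configuration: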
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_404;
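The failure chain described above can be sketched as a small simulation. This is a simplified model, not nginx's actual implementation: it assumes the default max_fails=1 and fail_timeout=10s on every server, plain in-order server selection, and hypothetical names (Peer, proxy) chosen for illustration.

```python
from dataclasses import dataclass

# Assumed nginx defaults for every upstream server.
MAX_FAILS = 1
FAIL_TIMEOUT = 10.0  # seconds

# Statuses that count as a failed attempt, but only when they are listed
# in proxy_next_upstream (403 and 404 never count as failures).
FAILURE_STATUSES = {500, 502, 503, 504, 429}


@dataclass
class Peer:
    name: str
    fails: int = 0
    down_until: float = 0.0  # the peer is skipped until this time

    def available(self, now: float) -> bool:
        return now >= self.down_until


def proxy(peers, backend_status, retry_on, now):
    """Pass one request through the upstream group.

    backend_status: the status every backend returns for this request
    retry_on: HTTP statuses listed in proxy_next_upstream
    Returns (status sent to the client, names of the peers tried).
    """
    tried = []
    status = None
    for peer in peers:
        if not peer.available(now):
            continue
        tried.append(peer.name)
        status = backend_status
        if status not in retry_on:
            return status, tried  # response is final, no retry
        if status in FAILURE_STATUSES:
            peer.fails += 1
            if peer.fails >= MAX_FAILS:
                peer.down_until = now + FAIL_TIMEOUT
                peer.fails = 0
        # otherwise fall through and pass the request to the next peer
    if not tried:
        return 502, tried  # "no live upstreams": nginx answers 502 itself
    return status, tried  # every peer failed: the last response is returned


retry_on = {500, 502, 503, 404}
peers = [Peer("ip1:8848"), Peer("ip2:8848"), Peer("ip3:8848")]
print(proxy(peers, 500, retry_on, now=0.0))   # (500, ['ip1:8848', 'ip2:8848', 'ip3:8848'])
print(proxy(peers, 200, retry_on, now=1.0))   # (502, [])
print(proxy(peers, 200, retry_on, now=11.0))  # (200, ['ip1:8848'])
```

One malformed request is enough to take the whole group out for the failure window, so a perfectly valid request one second later gets a 502. With http_500 removed from retry_on, the first call returns after a single server and no peer is marked down.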
The nginx access log:
    42.236.10.106 - - [13/Aug/2021:19:08:08 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" 500 0.006 0.001, 0.002, 0.002 nacos.a.com 58 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" - "-" "ip1:8848, ip2:8848, ip3:8848"
    42.236.10.106 - - [13/Aug/2021:19:08:10 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" 500 0.006 0.002, 0.001, 0.002 nacos.a.com 58 "http://baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" - "-" "ip1:8848, ip2:8848, ip3:8848"
    42.236.10.75 - - [13/Aug/2021:19:08:26 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" 500 0.009 0.002, 0.001, 0.006 nacos.a.com 58 "-" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" - "-" "ip1:8848, ip2:8848, ip3:8848"
    42.236.10.75 - - [13/Aug/2021:19:08:26 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" 502 0.001 0.001, 0.000 nacos.a.com 552 "http://baidu.com/" "Mozilla/5.0 (Linux; U; Android 8.1.0; zh-CN; EML-AL00 Build/HUAWEIEML-AL00) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/57.0.2987.108 baidu.sogo.uc.UCBrowser/11.9.4.974 UWS/2.13.1.48 Mobile Safari/537.36 AliApp(DingTalk/4.5.11) com.alibaba.android.rimet/10487439 Channel/227200 language/zh-CN" - "-" "ip3:8848, nacos-portal"
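Entries like these can be picked out of a large access log with a short script. The sketch below assumes the field layout inferred from the sample lines (status right after the quoted request, upstream addresses as the last quoted field) and assumes that nacos-portal is the name of the upstream group. nginx puts the group name into $upstream_addr when no server can be selected. The function names and the shortened user agent in the samples are illustrative only.

```python
import re

# Layout inferred from the sample log lines, not from the real log_format.
LINE_RE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) .*"(?P<upstreams>[^"]*)"\s*$'
)


def parse(line):
    """Return client, request, status and the list of upstreams tried."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return {
        "client": m.group("client"),
        "request": m.group("request"),
        "status": int(m.group("status")),
        "upstreams": [u.strip() for u in m.group("upstreams").split(",")],
    }


def was_retried(entry):
    """More than one upstream address means the request was passed on."""
    return len(entry["upstreams"]) > 1


def hit_no_live_upstreams(entry, group="nacos-portal"):
    """The group name appears when nginx could not select any server."""
    return group in entry["upstreams"]


# Sample lines with the user agent shortened for readability.
samples = [
    '42.236.10.106 - - [13/Aug/2021:19:08:08 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" '
    '500 0.006 0.001, 0.002, 0.002 nacos.a.com 58 "-" "Mozilla/5.0" - "-" '
    '"ip1:8848, ip2:8848, ip3:8848"',
    '42.236.10.75 - - [13/Aug/2021:19:08:26 +0800] "GET /nacos/v1/cs/configs HTTP/1.1" '
    '502 0.001 0.001, 0.000 nacos.a.com 552 "http://baidu.com/" "Mozilla/5.0" - "-" '
    '"ip3:8848, nacos-portal"',
]

for line in samples:
    entry = parse(line)
    print(entry["status"], entry["upstreams"],
          was_retried(entry), hit_no_live_upstreams(entry))
```

The first sample shows a 500 that was tried on all three servers. The second shows the 502 that nginx generated itself once no server was left to try.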
The Nacos log:
    org.springframework.web.bind.MissingServletRequestParameterException: Required String parameter 'dataId' is not present
        at org.springframework.web.method.annotation.RequestParamMethodArgumentResolver.handleMissingValue(RequestParamMethodArgumentResolver.java:202)
        at org.springframework.web.method.annotation.AbstractNamedValueMethodArgumentResolver.resolveArgument(AbstractNamedValueMethodArgumentResolver.java:113)
        at org.springframework.web.method.support.HandlerMethodArgumentResolverComposite.resolveArgument(HandlerMethodArgumentResolverComposite.java:126)
        at org.springframework.web.method.support.InvocableHandlerMethod.getMethodArgumentValues(InvocableHandlerMethod.java:166)
        at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:134)
        at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:102)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895)
        at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:800)
        at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
        at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1038)
        at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:942)
        at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1005)
        at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:897)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:634)
        at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:882)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.web.filter.CorsFilter.doFilterInternal(CorsFilter.java:96)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.springframework.boot.actuate.web.trace.servlet.HttpTraceFilter.doFilterInternal(HttpTraceFilter.java:90)
        at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at com.alibaba.nacos.core.auth.AuthFilter.doFilter(AuthFilter.java:60)
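When nginx acts as a reverse proxy in front of several backend real servers (RS), the proxy layer relies on a few mechanisms to provide fault tolerance and load balancing.
nginx's retry mechanism is one of those fault-tolerance mechanisms.
Official documentation: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_next_upstream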
    Syntax:  proxy_next_upstream error | timeout | invalid_header | http_500 | http_502 | http_503 | http_504 | http_403 | http_404 | http_429 | non_idempotent | off ...;
    Default: proxy_next_upstream error timeout;
    Context: http, server, location

    Specifies in which cases a request should be passed to the next server:
    error
    timeout
    invalid_header
    http_500
    http_502
    http_503
    http_504
    http_403
    http_404
    http_429
    non_idempotent
    off
A second directive limits how many times a request may be tried; 0 means no limit:
    Syntax:  proxy_next_upstream_tries number;
    Default: proxy_next_upstream_tries 0;
    Context: http, server, location
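How the two directives combine can be sketched as a single decision function. This is a rough model of the documented behaviour rather than nginx's code: it assumes that the tries limit counts the first attempt as well, and that POST, LOCK and PATCH requests already sent to a server are only retried when non_idempotent is listed. The function name and parameters are hypothetical.

```python
# Methods nginx treats as non-idempotent for retry purposes.
NON_IDEMPOTENT_METHODS = {"POST", "LOCK", "PATCH"}

# Default value of proxy_next_upstream.
DEFAULT_CONDITIONS = frozenset({"error", "timeout"})


def may_pass_to_next(condition, method, attempts_made,
                     conditions=DEFAULT_CONDITIONS, tries=0,
                     request_sent=True):
    """Decide whether a failed request may go to the next server.

    condition: what just happened, e.g. "error", "timeout", "http_500"
    attempts_made: number of servers already tried, including this one
    conditions: values configured in proxy_next_upstream
    tries: proxy_next_upstream_tries (0 means no limit)
    """
    if "off" in conditions:
        return False
    if condition not in conditions:
        return False
    if tries and attempts_made >= tries:
        return False  # the try budget is used up
    if (method in NON_IDEMPOTENT_METHODS and request_sent
            and "non_idempotent" not in conditions):
        return False
    return True


incident = {"error", "timeout", "invalid_header",
            "http_500", "http_502", "http_503", "http_404"}

# With the configuration from the incident, a GET answered with 500 is retried.
print(may_pass_to_next("http_500", "GET", 1, incident))   # True
# With the default configuration the 500 goes straight back to the client.
print(may_pass_to_next("http_500", "GET", 1))             # False
# A tries limit of 2 stops the request after the second server.
print(may_pass_to_next("http_500", "GET", 2, incident, tries=2))  # False
```

Under these assumptions, a limit such as proxy_next_upstream_tries 2 would have capped how many servers a single bad request could reach, so it could not have marked all three servers as failed by itself.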
Example:
    upstream app-proxy {
        server 192.168.5.100:8080;
        server 192.168.5.101:8080;
        # The check* directives below are not part of stock nginx; they come
        # from the third-party nginx_upstream_check_module (bundled with Tengine).
        check interval=2000 rise=1 fall=3 timeout=3000 type=http;
        check_keepalive_requests 1;

        check_http_send "GET /status/status.html HTTP/1.1\r\nConnection: close\r\nHost: localhost\r\n\r\n";
        check_http_expect_alive http_2xx http_3xx;
    }

    location / {
        proxy_pass http://app-proxy;
        # Retry on connection errors, timeouts and 5xx responses, at most 3 tries.
        proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_connect_timeout 60s;
        proxy_read_timeout 60s;
        proxy_send_timeout 60s;
        proxy_pass_request_headers on;
        proxy_set_header Host $host:$server_port;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        set $domain default;
    }