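1. Business scenario
For business reasons, all services are deployed with Docker. After a server reboot, the Docker daemon managed by systemctl starts the containers concurrently, so a business service sometimes comes up before the middleware it depends on. On first startup the business service connects to the middleware only once; if that connection fails, it never retries.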
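2. Fix options
(1) Increase the number of connection retries in the business service (developers change the Java code).
(2) Make the business container's startup script probe the middleware services and start the application only after the probes succeed, which gives the middleware time to start.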
- Download the wait-for-it.sh script
wget -O wait-for-it-master.zip https://github.com/vishnubob/wait-for-it/archive/refs/heads/master.zip
unzip -j wait-for-it-master.zip wait-for-it-master/wait-for-it.sh
chmod +x wait-for-it.sh
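For orientation, the kind of probe that wait-for-it.sh performs can be approximated in a few lines of bash. The following is a minimal sketch, not the actual wait-for-it.sh source: the function name wait_for_port is hypothetical, and it assumes bash, because the /dev/tcp pseudo-device is a bash feature.

```shell
#!/usr/bin/env bash
# Minimal sketch of a TCP readiness probe (NOT the wait-for-it.sh source).
# Usage: wait_for_port HOST PORT TIMEOUT_SECONDS
# Returns 0 as soon as a TCP connection succeeds, 1 once the timeout expires.
wait_for_port() {
    local host=$1 port=$2 timeout=$3
    local deadline=$(( $(date +%s) + timeout ))
    while :; do
        # Bash-only: opening /dev/tcp/HOST/PORT attempts a TCP connect.
        # The subshell closes the socket again as soon as it exits.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        # Give up once the deadline has passed.
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep 1
    done
}
```

For example, wait_for_port 127.0.0.1 3306 120 would block until MySQL accepts connections or 120 seconds pass. A refused connection on localhost fails immediately; a connect attempt to an unreachable remote host can block longer than one second, which this sketch does not guard against.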
- Dockerfile
FROM jre-centos7:8u251
MAINTAINER xuxiaoming@zhengjue-ai.com
# wait-for-it.sh must be in the build context; entrypoint.sh expects it at /device-message/wait-for-it.sh
COPY entrypoint.sh wait-for-it.sh *.jar /device-message/
WORKDIR /device-message/
HEALTHCHECK --interval=10s --timeout=30s --retries=3 CMD \
    curl --fail -sL -w "http_code:%{http_code} \n" \
    -o /dev/null http://localhost:8070/info || exit 1
RUN chmod +x /device-message/entrypoint.sh
ENTRYPOINT ["sh","/device-message/entrypoint.sh"]
- entrypoint.sh script
#!/bin/bash
WAIT_FOR_IT="/device-message/wait-for-it.sh"
# Maximum time, in seconds, to wait for each dependency
TIMEOUT=120
# Middleware endpoints (host:port) that must accept connections first
SERVICES=(
    "127.0.0.1:3306"
    "127.0.0.1:6379"
    "127.0.0.1:8848"
    "127.0.0.1:9001"
    "127.0.0.1:9200"
)

log_with_timestamp() {
    echo "$(date +'%Y-%m-%d %H:%M:%S,%3N') $1"
}

# Probe the dependencies one by one; abort if any of them times out
if [ -x "$WAIT_FOR_IT" ]; then
    log_with_timestamp "Starting dependency checks"
    for target in "${SERVICES[@]}"; do
        if ! "$WAIT_FOR_IT" "$target" -t "$TIMEOUT"; then
            log_with_timestamp "ERROR: Service $target not available within $TIMEOUT seconds" >&2
            exit 1
        fi
    done
    log_with_timestamp "All dependencies are ready - starting application"
fi

log_with_timestamp "Launching Java application"
# exec replaces the shell so that the JVM receives container signals directly
exec java -Xmx2048m -Xms2048m \
    -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:/device-message/logs/gc-%t.log \
    -Dfile.encoding=utf-8 -Duser.timezone=GMT+8 \
    -jar ./device-message-*.jar --spring.config.location=/device-message/config/
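- At startup the business service probes each middleware port in turn. The timeout is 120 s per port and can be changed through the TIMEOUT variable.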
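One way to make the timeout and the target list adjustable without rebuilding the image is to read them from environment variables. The sketch below is only an illustration: WAIT_TIMEOUT and WAIT_SERVICES, as well as the helper names probe_timeout and probe_targets, are hypothetical and do not appear in the original script. The defaults mirror the values hard-coded in entrypoint.sh.

```shell
#!/bin/sh
# Sketch only: WAIT_TIMEOUT and WAIT_SERVICES are hypothetical variable names.

# Print the probe timeout in seconds: $WAIT_TIMEOUT if set, otherwise 120.
probe_timeout() {
    echo "${WAIT_TIMEOUT:-120}"
}

# Print one host:port per line: $WAIT_SERVICES (space-separated) if set,
# otherwise the five endpoints hard-coded in entrypoint.sh.
probe_targets() {
    default="127.0.0.1:3306 127.0.0.1:6379 127.0.0.1:8848 127.0.0.1:9001 127.0.0.1:9200"
    # Unquoted on purpose: word splitting turns the list into separate arguments.
    printf '%s\n' ${WAIT_SERVICES:-$default}
}
```

Inside entrypoint.sh the loop could then iterate over $(probe_targets) and pass -t "$(probe_timeout)" to wait-for-it.sh, and a deployment could override the values with docker run -e WAIT_TIMEOUT=300 -e WAIT_SERVICES="127.0.0.1:3306 127.0.0.1:6379".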
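- View the startup log with docker logs -f device-message --tail 200. In the sample output below the probed ports (23306, 26379, 28848, 29001, 29200) differ from the SERVICES list shown above, which suggests the log was captured with a different port configuration.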
2025-07-28 12:28:23,052 Starting dependency checks
wait-for-it.sh: waiting 120 seconds for 127.0.0.1:23306
wait-for-it.sh: 127.0.0.1:23306 is available after 0 seconds
wait-for-it.sh: waiting 120 seconds for 127.0.0.1:26379
wait-for-it.sh: 127.0.0.1:26379 is available after 0 seconds
wait-for-it.sh: waiting 120 seconds for 127.0.0.1:28848
wait-for-it.sh: 127.0.0.1:28848 is available after 69 seconds
wait-for-it.sh: waiting 120 seconds for 127.0.0.1:29001
wait-for-it.sh: 127.0.0.1:29001 is available after 0 seconds
wait-for-it.sh: waiting 120 seconds for 127.0.0.1:29200
wait-for-it.sh: 127.0.0.1:29200 is available after 0 seconds
2025-07-28 12:29:32,354 All dependencies are ready - starting application
2025-07-28 12:29:32,356 Launching Java application
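To see at a glance which dependency delayed startup, the "is available after N seconds" lines can be extracted and sorted. This is a minimal sketch that assumes the log format shown above; the function name summarize_waits is hypothetical.

```shell
#!/bin/sh
# Read log text on stdin and print "<seconds> <host:port>" for every
# dependency that wait-for-it.sh reported as available, slowest first.
summarize_waits() {
    sed -n 's/^wait-for-it\.sh: \(.*\) is available after \([0-9][0-9]*\) seconds$/\2 \1/p' |
        sort -rn
}
```

A possible invocation is docker logs device-message 2>&1 | summarize_waits | head -n 1; the 2>&1 is there because wait-for-it.sh writes its status messages to stderr. For the sample log above, the first line of output is 69 127.0.0.1:28848.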