
[Flink/Java] Flink Job Runtime Issues FAQ

Overview: Flink Job Runtime Issues FAQ

Across the Flink job failures handled so far, the usual causes are:

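  • No network connectivity to third-party resources (databases, OSS, etc.), typically in the early stage of environment setup, or an unstable network
  • Configuration errors (URL / username / password; letter case, spaces, and other special characters)
  • Insufficient CU resources in the cluster/queue
  • Insufficient JVM memory for the Flink job
  • MySQL binlog expired or invalidated in a Flink CDC job
  • Checkpoint save failure
  • Business logic or data volume of the Flink program is too heavy: processing slows down and checkpoints time out
  • Components the Flink program depends on (OSS, MySQL, Redis, OLAP databases, etc.) are unstable or crash, causing checkpoints to time out
  • ...
  • See also: [Flink] Flink CDC FAQ - 博客园/千千寰宇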
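
For the first cause in the list, a quick TCP reachability check from the machine that runs the job can rule out basic connectivity problems before looking at Flink itself. The following is a minimal sketch using only the JDK; the class name ConnectivityProbe and the default host and port are hypothetical placeholders, not part of Flink or of the original setup.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not part of Flink): checks whether a TCP endpoint accepts connections. */
final class ConnectivityProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Refused, timed out, unresolvable host, or invalid port/timeout:
            // all are treated as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 3306;
        System.out.println(host + ":" + port + " reachable = " + canConnect(host, port, 3000));
    }
}
```

A successful connection only shows that the port is reachable. It does not validate the URL, username, or password, so configuration errors still have to be checked separately.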

Q: Flink fails at runtime with java.lang.IllegalStateException: Buffer pool is destroyed.

Problem description

  • At runtime, Flink reports java.lang.IllegalStateException: Buffer pool is destroyed.

In the log, this error appears together with "Could not forward element to next operator".

java.lang.RuntimeException: Buffer pool is destroyed.
    at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:110) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:89) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:45) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at com.ucarinc.framework.flink.connectors.flexq.FlexQSource.run(FlexQSource.java:204) ~[flink-connector-flexq-1.8.500-20191206.054312-28.jar:1.8.500-SNAPSHOT]
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) [flink-runtime_2.11-1.8.1.jar:1.8.1]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_31]
Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.internalRequestMemorySegment(LocalBufferPool.java:264) ~[flink-runtime_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestMemorySegment(LocalBufferPool.java:240) ~[flink-runtime_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBuilderBlocking(LocalBufferPool.java:218) ~[flink-runtime_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.requestNewBufferBuilder(RecordWriter.java:264) ~[flink-runtime_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.getBufferBuilder(RecordWriter.java:257) ~[flink-runtime_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.copyFromSerializerToTargetChannel(RecordWriter.java:177) ~[flink-runtime_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:162) ~[flink-runtime_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:128) ~[flink-runtime_2.11-1.8.1.jar:1.8.1]
    at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107) ~[flink-streaming-java_2.11-1.8.1.jar:1.8.1]
    ... 18 more

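Analysis

  • This usually means the job's network buffers are insufficient; try increasing the job's network buffer size.

  • The trace shows the source thread requesting a buffer from a LocalBufferPool that has already been destroyed. Flink destroys a task's buffer pool when the task is cancelled or fails, so if an earlier exception appears in the log, that exception may be the actual root cause.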

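Solutions

  • Method 1: free up memory on the machine running the job, then resubmit the job (verified to work)

  • Method 2: add taskmanager.memory.network.fraction 0.2 to the advanced parameters (the default is 0.1; adjust as needed). This key applies to Flink 1.10 and later; on Flink 1.8.x, the version in the stack trace above, the corresponding key is taskmanager.network.memory.fraction.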
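
To get a feel for what changing the fraction in Method 2 does, the sketch below works through the arithmetic. It assumes the rule documented for Flink 1.10 and later: network memory is the configured fraction of total Flink memory, clamped between taskmanager.memory.network.min and taskmanager.memory.network.max, and then divided into segments of taskmanager.memory.segment-size (32 KiB by default). The class NetworkMemorySketch, the 4 GiB total, and the 64 MiB / 1 GiB bounds are illustrative assumptions; the default bounds differ between Flink versions, so check the documentation for yours.

```java
/** Illustrative arithmetic only; this is not Flink code. */
final class NetworkMemorySketch {

    static final long KIB = 1024L;
    static final long MIB = 1024L * KIB;

    /** Fraction of total Flink memory, clamped to [minBytes, maxBytes]. */
    static long networkMemoryBytes(long totalFlinkMemoryBytes, double fraction,
                                   long minBytes, long maxBytes) {
        long derived = (long) (totalFlinkMemoryBytes * fraction);
        return Math.max(minBytes, Math.min(maxBytes, derived));
    }

    /** Number of whole network buffers (memory segments) that fit in the given memory. */
    static long bufferCount(long networkMemoryBytes, long segmentSizeBytes) {
        return networkMemoryBytes / segmentSizeBytes;
    }

    public static void main(String[] args) {
        long total = 4096 * MIB;   // assumed total Flink memory of one TaskManager
        long min = 64 * MIB;       // assumed taskmanager.memory.network.min
        long max = 1024 * MIB;     // assumed taskmanager.memory.network.max
        long segment = 32 * KIB;   // taskmanager.memory.segment-size default

        for (double fraction : new double[] {0.1, 0.2}) {
            long bytes = networkMemoryBytes(total, fraction, min, max);
            System.out.println("fraction=" + fraction
                    + " -> " + (bytes / MIB) + " MiB, "
                    + bufferCount(bytes, segment) + " buffers");
        }
    }
}
```

Because of the clamp, raising the fraction has no effect once the derived value reaches the configured maximum, and a small TaskManager may already sit at the minimum. In those cases the min or max setting would be the one to adjust.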

References

  • FAQ-Buffer pool is destroyed. - NetEase Youshu Xuetang (网易-有数学堂) / EasyData data development and governance platform FAQ (recommended)
  • Flink troubleshooting: Buffer pool is destroyed. - CSDN
