Cloud Dataflow - Increase JVM Xmx Value

Problem description

We are trying to run a Google Cloud Dataflow job in the cloud but we keep getting "java.lang.OutOfMemoryError: Java heap space".

We are trying to process 610 million records from a BigQuery table and write the processed records to 12 different outputs (main + 11 side outputs).

We have tried increasing our number of instances to 64 n1-standard-4 instances but we are still getting the issue.

The Xmx value on the VMs seems to be set at ~4 GB (-Xmx3951927296), even though the instances have 15 GB of memory. Is there any way of increasing the Xmx value?

The job ID is 2015-06-11_21_32_32-16904087942426468793.

Solution

You can't directly set the heap size. Dataflow, however, scales the heap size with the machine type. You can pick a machine with more memory by setting the flag "--machineType". The heap size should increase linearly with the total memory of the machine type.

Dataflow deliberately limits the heap size to avoid negatively impacting the shuffler.
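To see what heap a given machine type actually yields, one option is to log the JVM's own view of its limit. The following is a minimal sketch using only the standard `Runtime.maxMemory()` call; the class name `HeapReport` and its helper methods are hypothetical and nothing here is specific to Dataflow. It also converts the `-Xmx3951927296` value from the question, which works out to about 3.68 GiB.

```java
// Hypothetical helper, not part of any Dataflow SDK.
final class HeapReport {
    static final long BYTES_PER_GIB = 1024L * 1024L * 1024L;

    // Converts a byte count (for example an -Xmx value) to GiB.
    public static double bytesToGiB(long bytes) {
        return (double) bytes / BYTES_PER_GIB;
    }

    // Maximum heap the running JVM will attempt to use.
    // Returns Long.MAX_VALUE when the JVM reports no inherent limit.
    public static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long xmxFromQuestion = 3951927296L; // value quoted in the question
        System.out.printf("Xmx from the question: %.2f GiB%n",
                bytesToGiB(xmxFromQuestion));
        System.out.printf("Max heap of this JVM:  %.2f GiB%n",
                bytesToGiB(maxHeapBytes()));
    }
}
```

Calling `HeapReport.maxHeapBytes()` from worker code and writing the result to the job log before and after changing the machine type would show whether the heap grew as expected.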

Is your code explicitly accumulating values from multiple records in memory? Do you expect 4GB to be insufficient for any given record?

Dataflow's memory requirements should scale with the size of individual records and the amount of data your code is buffering in memory. Dataflow's memory requirements shouldn't increase with the number of records.
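The distinction can be illustrated in plain Java, without any Dataflow API. This is only a sketch under that assumption: the class `BufferingSketch` and both method names are hypothetical, and summing stands in for whatever per-record processing the real job does. The first method retains every record, so its heap use grows with the record count; the second keeps only a running total, so its heap use stays constant however many records pass through.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.LongStream;

// Hypothetical illustration, not Dataflow code.
final class BufferingSketch {

    // Heap use grows with the number of records: every value is retained
    // in the list until the end. At hundreds of millions of records this
    // pattern can exhaust a fixed-size heap.
    public static long sumByBuffering(Iterator<Long> records) {
        List<Long> buffer = new ArrayList<>();
        while (records.hasNext()) {
            buffer.add(records.next());
        }
        long total = 0;
        for (long value : buffer) {
            total += value;
        }
        return total;
    }

    // Heap use is constant: each record is folded into the running total
    // and then becomes eligible for garbage collection.
    public static long sumByStreaming(Iterator<Long> records) {
        long total = 0;
        while (records.hasNext()) {
            total += records.next();
        }
        return total;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println(sumByStreaming(LongStream.rangeClosed(1, n).iterator()));
        System.out.println(sumByBuffering(LongStream.rangeClosed(1, n).iterator()));
    }
}
```

Both methods return the same result; only the amount of data held in memory differs. If the job's code resembles the first method, restructuring it toward the second is likely to help more than a larger heap.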
