Kafka KStream application - temp file cleanup


Question

It seems that my KStream-based application has been piling up many GBs of files (.sst, Log.old.<stamp>, etc.).

Will these get cleaned up on their own or is this something I need to keep an eye on? Some param to be set to cull them?

Recommended answer

About these local/temp files: some of them are application state, and those should account for most of the space consumed. Your application may be "piling up" many GBs of files simply because it is actually managing a lot of state. If you delete these files, they can be reconstructed automatically by replaying the state's changelog from Kafka, but this may take some time.
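To check how much of the space really is state, it can help to break the directory's disk usage down by file type. The following is a minimal sketch using only the Java standard library, not part of Kafka Streams. The class name `StateDirUsage` and the category names are made up for illustration, and the fallback path `/tmp/kafka-streams` is an assumption; pass whatever your `state.dir` setting actually points to.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: sums file sizes under a state directory per coarse file category.
public final class StateDirUsage {

    // Classify a file name: .sst table files, rotated logs (Log.old.<stamp>), or anything else.
    static String categorize(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".sst")) return "sst";
        if (lower.startsWith("log.old.")) return "log.old";
        return "other";
    }

    // Walk the tree below root and return total bytes per category.
    static Map<String, Long> usageByCategory(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, Long> totals = new TreeMap<>();
        for (Path file : files) {
            try {
                totals.merge(categorize(file.getFileName().toString()), Files.size(file), Long::sum);
            } catch (NoSuchFileException e) {
                // A running application may remove files between listing and sizing; skip those.
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // Assumed fallback location; replace with your own state.dir value.
        Path root = Paths.get(args.length > 0 ? args[0] : "/tmp/kafka-streams");
        usageByCategory(root).forEach((category, bytes) ->
                System.out.printf("%-8s %,d bytes%n", category, bytes));
    }
}
```

If the `sst` total dominates, the space is most likely genuine state, as described above. A large `log.old` share would point at rotated log files rather than state.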

Will these get cleaned up on their own or is this something I need to keep an eye on? Some param to be set to cull them?

Some cleaning up is done, but as I wrote above, the files most probably consume that space for a reason. Perhaps you can share a snippet of the app's processing topology as well as some info about the data the app is processing, which might help to understand whether the consumed space seems about right or whether there might be an issue.

Clean up: The latest version of Kafka at the time of writing (0.10.0.1) ships with an application reset tool for Kafka Streams plus some accompanying API methods that help with cleaning/resetting; see Data Reprocessing with Kafka Streams: Resetting a Streams Application. That said, I am not sure whether you intend to clean up files because you have stopped the application and want to get rid of all the local data, or because you want to do some "garbage collection" while the app is still running. If it's the latter (GC), then in general there's no need to: the files are there for a good reason, and most probably will just be recreated.
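For the first case (the application is stopped and its local data should go), the supported route is the library's own API: `KafkaStreams#cleanUp()`, which can only be called while the instance is not running. The sketch below is not that API. It is a standard-library illustration of what wiping local state amounts to on disk, assuming the layout `<state.dir>/<application.id>`. The class `LocalStateWipe` and its methods are hypothetical names, and nothing here touches committed offsets or internal topics, which is what the reset tool is for.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: removes the local state directory of a STOPPED Streams application.
public final class LocalStateWipe {

    // Assumed layout: each application keeps its local state under <state.dir>/<application.id>.
    static Path appStateDir(String stateDir, String applicationId) {
        return Paths.get(stateDir).resolve(applicationId);
    }

    // Delete a directory tree, children before parents; returns the number of entries removed.
    static int deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return 0;
        }
        List<Path> entries;
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse path order lists every child before its parent directory.
            entries = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
        return entries.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: LocalStateWipe <state.dir> <application.id>");
            return;
        }
        Path target = appStateDir(args[0], args[1]);
        System.out.println("Removed " + deleteRecursively(target) + " entries under " + target);
    }
}
```

Run it only while the application is stopped. On the next start the state is rebuilt from the changelog topics, which, as noted above, may take some time.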
