Kafka producer callback Exception
Question
When we produce messages we can define a callback, which may receive an exception:
kafkaProducer.send(producerRecord, new Callback() {
    @Override
    public void onCompletion(RecordMetadata recordMetadata, Exception e) {
        if (e == null) {
            // OK: the record was acknowledged
        } else {
            // NOT OK: the send failed with exception e
        }
    }
});
Considering the built-in retry logic in the producer, which kinds of exceptions should developers handle explicitly?
Recommended answer
According to the Callback Javadoc, the following exceptions can be passed to the callback:
The exception thrown during processing of this record. Null if no error occurred. Possible thrown exceptions include:
Non-Retriable exceptions (fatal, the message will never be sent):
- InvalidTopicException
- OffsetMetadataTooLargeException
- RecordBatchTooLargeException
- RecordTooLargeException
- UnknownServerException
Retriable exceptions (transient, may be covered by increasing the number of retries):
- CorruptRecordException
- InvalidMetadataException
- NotEnoughReplicasAfterAppendException
- NotEnoughReplicasException
- OffsetOutOfRangeException
- TimeoutException
- UnknownTopicOrPartitionException
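As a rough illustration of how a callback might branch on the two groups, here is a minimal sketch. To stay runnable without kafka-clients on the classpath it matches on the simple class name and defines two stand-in exception types; the names CallbackErrorClassifier, Outcome and classify are hypothetical. In real code, an instanceof check against org.apache.kafka.common.errors.RetriableException is usually the more direct route.

```java
import java.util.Set;

// Illustrative sketch only: classifies by simple class name so it runs
// without kafka-clients. All names in this class are hypothetical.
class CallbackErrorClassifier {
    enum Outcome { SUCCESS, RETRIABLE, FATAL }

    // Names copied from the retriable group in the Javadoc list.
    static final Set<String> RETRIABLE_NAMES = Set.of(
            "CorruptRecordException",
            "InvalidMetadataException",
            "NotEnoughReplicasAfterAppendException",
            "NotEnoughReplicasException",
            "OffsetOutOfRangeException",
            "TimeoutException",
            "UnknownTopicOrPartitionException");

    // Mirrors the null check in onCompletion: null means the send succeeded.
    // Anything not in the retriable set is treated as fatal (an assumption).
    static Outcome classify(Exception e) {
        if (e == null) {
            return Outcome.SUCCESS;
        }
        return RETRIABLE_NAMES.contains(e.getClass().getSimpleName())
                ? Outcome.RETRIABLE
                : Outcome.FATAL;
    }

    // Stand-ins for the Kafka exception classes, defined here for the demo only.
    static class TimeoutException extends RuntimeException {}
    static class RecordTooLargeException extends RuntimeException {}

    public static void main(String[] args) {
        System.out.println(classify(null));
        System.out.println(classify(new TimeoutException()));
        System.out.println(classify(new RecordTooLargeException()));
    }
}
```

Because the producer has already retried internally by the time the callback runs, a RETRIABLE outcome here means the retries or the delivery timeout were exhausted. Whether to re-send, log, or drop the record at that point is an application decision.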
Maybe this is an unsatisfactory answer, but in the end, which exceptions you handle and how you handle them depends entirely on your use case and business requirements.
However, as a developer you also need to deal with the retry mechanism of the Kafka producer itself. Retries are mainly driven by:
retries: Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries without setting max.in.flight.requests.per.connection (default: 5) to 1 will potentially change the ordering of records because if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first. Note additionally that produce requests will be failed before the number of retries has been exhausted if the timeout configured by delivery.timeout.ms expires first before successful acknowledgement. Users should generally prefer to leave this config unset and instead use delivery.timeout.ms to control retry behavior.
retry.backoff.ms: The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.
request.timeout.ms: The configuration controls the maximum amount of time the client will wait for the response of a request. If the response is not received before the timeout elapses the client will resend the request if necessary or fail the request if retries are exhausted. This should be larger than replica.lag.time.max.ms (a broker configuration) to reduce the possibility of message duplication due to unnecessary producer retries.
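The ordering caveat in the retries description can be pictured with a toy model. This is not Kafka code: the partition log is a plain list, and the class RetryReorderingDemo and its deliver method are hypothetical names used only to show why a retried first batch can land behind a second batch when more than one request is in flight.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not Kafka code: a partition log is a list of record labels.
class RetryReorderingDemo {
    // Returns the order in which records end up in the partition log.
    // firstFailsOnce: the first batch hits a transient error on its first attempt.
    // maxInFlight: models max.in.flight.requests.per.connection.
    static List<String> deliver(List<String> first, List<String> second,
                                boolean firstFailsOnce, int maxInFlight) {
        List<String> partitionLog = new ArrayList<>();
        if (firstFailsOnce && maxInFlight > 1) {
            // The second batch was already in flight and succeeds first;
            // the retry of the first batch is appended afterwards.
            partitionLog.addAll(second);
            partitionLog.addAll(first);
        } else {
            // With a single in-flight request, the second batch is not sent
            // until the first one (including its retry) has completed.
            partitionLog.addAll(first);
            partitionLog.addAll(second);
        }
        return partitionLog;
    }

    public static void main(String[] args) {
        List<String> a = List.of("a1", "a2");
        List<String> b = List.of("b1", "b2");
        System.out.println(deliver(a, b, true, 5)); // [b1, b2, a1, a2]
        System.out.println(deliver(a, b, true, 1)); // [a1, a2, b1, b2]
    }
}
```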
The recommendation is to keep the default values of the three configurations above and focus instead on the hard upper time limit defined by:
delivery.timeout.ms: An upper bound on the time to report success or failure after a call to send() returns. This limits the total time that a record will be delayed prior to sending, the time to await acknowledgement from the broker (if expected), and the time allowed for retriable send failures. The producer may report failure to send a record earlier than this config if either an unrecoverable error is encountered, the retries have been exhausted, or the record is added to a batch which reached an earlier delivery expiration deadline. The value of this config should be greater than or equal to the sum of request.timeout.ms and linger.ms.
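The constraint between the three timing settings can be checked with simple arithmetic before the producer is created. The sketch below uses only java.util.Properties; the class DeliveryTimeoutCheck, its methods, and the numeric values are illustrative assumptions, not recommended settings.

```java
import java.util.Properties;

// Hypothetical helper: checks delivery.timeout.ms >= request.timeout.ms + linger.ms.
class DeliveryTimeoutCheck {
    // Expects all three keys to be present as numeric strings;
    // a missing key makes Long.parseLong throw NumberFormatException.
    static boolean isConsistent(Properties props) {
        long delivery = Long.parseLong(props.getProperty("delivery.timeout.ms"));
        long request = Long.parseLong(props.getProperty("request.timeout.ms"));
        long linger = Long.parseLong(props.getProperty("linger.ms"));
        return delivery >= request + linger;
    }

    // Builds a Properties object holding only the three timing settings.
    static Properties timing(long deliveryMs, long requestMs, long lingerMs) {
        Properties props = new Properties();
        props.setProperty("delivery.timeout.ms", Long.toString(deliveryMs));
        props.setProperty("request.timeout.ms", Long.toString(requestMs));
        props.setProperty("linger.ms", Long.toString(lingerMs));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        System.out.println(isConsistent(timing(120_000, 30_000, 100))); // true
        System.out.println(isConsistent(timing(20_000, 30_000, 100)));  // false
    }
}
```

The same Properties object could then be extended with the remaining producer settings, so the check runs on exactly the values the producer will use.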