How to handle: java.util.concurrent.TimeoutException: android.os.BinderProxy.finalize() timed out after 10 seconds errors?

Problem description

We're seeing a number of TimeoutExceptions in GcWatcher.finalize, BinderProxy.finalize, and PlainSocketImpl.finalize. 90+% of them happen on Android 4.3. We're getting reports of this from Crittercism from users out in the field.

The error is a variation of: "com.android.internal.BinderInternal$GcWatcher.finalize() timed out after 10 seconds"

java.util.concurrent.TimeoutException: android.os.BinderProxy.finalize() timed out after 10 seconds
at android.os.BinderProxy.destroy(Native Method)
at android.os.BinderProxy.finalize(Binder.java:459)
at java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:187)
at java.lang.Daemons$FinalizerDaemon.run(Daemons.java:170)
at java.lang.Thread.run(Thread.java:841)

So far we haven't had any luck reproducing the problem in house or figuring out what might have caused it.

Any ideas what can cause this? Any idea how to debug this and find out which part of the app causes this? Anything that sheds light on the issue helps.

More Stacktraces:

1   android.os.BinderProxy.destroy  
2   android.os.BinderProxy.finalize Binder.java, line 482
3   java.lang.Daemons$FinalizerDaemon.doFinalize    Daemons.java, line 187
4   java.lang.Daemons$FinalizerDaemon.run   Daemons.java, line 170
5   java.lang.Thread.run    Thread.java, line 841  

2

1   java.lang.Object.wait   
2   java.lang.Object.wait   Object.java, line 401
3   java.lang.ref.ReferenceQueue.remove ReferenceQueue.java, line 102
4   java.lang.ref.ReferenceQueue.remove ReferenceQueue.java, line 73
5   java.lang.Daemons$FinalizerDaemon.run   Daemons.java, line 170
6   java.lang.Thread.run

3

1   java.util.HashMap.newKeyIterator    HashMap.java, line 907
2   java.util.HashMap$KeySet.iterator   HashMap.java, line 913
3   java.util.HashSet.iterator  HashSet.java, line 161
4   java.util.concurrent.ThreadPoolExecutor.interruptIdleWorkers    ThreadPoolExecutor.java, line 755
5   java.util.concurrent.ThreadPoolExecutor.interruptIdleWorkers    ThreadPoolExecutor.java, line 778
6   java.util.concurrent.ThreadPoolExecutor.shutdown    ThreadPoolExecutor.java, line 1357
7   java.util.concurrent.ThreadPoolExecutor.finalize    ThreadPoolExecutor.java, line 1443
8   java.lang.Daemons$FinalizerDaemon.doFinalize    Daemons.java, line 187
9   java.lang.Daemons$FinalizerDaemon.run   Daemons.java, line 170
10  java.lang.Thread.run

4

1   com.android.internal.os.BinderInternal$GcWatcher.finalize   BinderInternal.java, line 47
2   java.lang.Daemons$FinalizerDaemon.doFinalize    Daemons.java, line 187
3   java.lang.Daemons$FinalizerDaemon.run   Daemons.java, line 170
4   java.lang.Thread.run

Solution

Full disclosure - I'm the author of the previously mentioned talk at TLV DroidCon. I had a chance to examine this issue across many Android applications and to discuss it with other developers who encountered it - and we all arrived at the same conclusion: this issue cannot be avoided, only minimized.

I took a closer look at the default implementation of the Android garbage collector code to better understand why this exception is thrown and what the possible causes could be. I even found a possible root cause during experimentation.

The root of the problem is the point at which a device "goes to sleep" for a while - this means that the OS has decided to lower battery consumption by stopping most user-land processes for a while, turning the screen off, reducing CPU cycles, etc. This is done at the Linux system level, where processes are paused mid-run. This can happen at any time during normal application execution, but it will stop at a native system call, as the context switching is done at the kernel level. So - this is where the Dalvik GC joins the story.

The Dalvik GC code (as implemented in the Dalvik project on the AOSP site) is not a complicated piece of code. The basic way it works is covered in my DroidCon slides. What I did not cover is the basic GC loop - the point where the collector has a list of objects to finalize (and destroy). The loop logic at its base can be simplified like this:

1. Take starting_timestamp.
2. Remove an object from the list of objects to release.
3. Release the object - finalize() and call native destroy() if required.
4. Take end_timestamp.
5. Calculate (end_timestamp - starting_timestamp) and compare it against a hard-coded timeout value of 10 seconds.
6. If the timeout has been reached - throw the TimeoutException and kill the process.
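The simplified loop above can be modelled in plain Java. This is only a sketch of the six steps as the answer describes them, not the AOSP implementation: the class name `FinalizerLoopSketch`, the method `releaseAll`, and the injectable clock are all hypothetical and exist purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeoutException;
import java.util.function.LongSupplier;

/** Simplified model of the finalization loop described above; NOT the AOSP code. */
final class FinalizerLoopSketch {
    /** Mirrors the hard-coded 10-second limit mentioned in the answer. */
    static final long TIMEOUT_MILLIS = 10_000;

    /**
     * Releases every pending object, timing each release with the supplied clock.
     * Returns the number of objects released before the queue ran empty.
     */
    static int releaseAll(Deque<Runnable> pending, LongSupplier clockMillis)
            throws TimeoutException {
        int released = 0;
        while (!pending.isEmpty()) {
            long start = clockMillis.getAsLong();  // step 1: take starting timestamp
            Runnable finalizer = pending.poll();   // step 2: remove object from the list
            finalizer.run();                       // step 3: finalize() / destroy()
            long end = clockMillis.getAsLong();    // step 4: take end timestamp
            if (end - start > TIMEOUT_MILLIS) {    // steps 5 and 6: compare, then fail
                throw new TimeoutException("finalize() timed out after 10 seconds");
            }
            released++;
        }
        return released;
    }

    public static void main(String[] args) throws Exception {
        Deque<Runnable> pending = new ArrayDeque<>();
        pending.add(() -> { /* a fast finalizer */ });
        System.out.println(releaseAll(pending, System::currentTimeMillis));
    }
}
```

Passing the clock in as a parameter is a design choice for the sketch only: it lets a test simulate a long pause without actually waiting ten seconds.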

Now consider the following scenario: an application runs along doing its thing. This is not a user-facing application; it runs in the background. During this background operation, objects are created, used, and need to be collected to release memory. The application does not bother with a wakelock - as this would affect the battery adversely, and seems unnecessary. This means the application will invoke the GC from time to time. Normally the GC run completes without a hitch. Sometimes (very rarely) the system will decide to sleep in the middle of a GC run. This will happen if you run your application long enough, and you can see it if you monitor the Dalvik memory logs closely.

Now consider the timestamp logic of the basic GC loop: it is possible for the device to start the run, take a start_stamp, and go to sleep at the destroy() native call on a system object. When it wakes up and resumes the run, the destroy() will finish, and the next end_stamp will include the time the destroy() call took plus the sleep time. If the sleep time was long - over 10 seconds - the TimeoutException will be thrown. I have seen this in the graphs generated by the analysis Python script - for Android system applications, not just my own monitored apps. Collect enough logs and you will eventually see it.
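The arithmetic behind this scenario is small enough to write out. The sketch below assumes, as the answer does, that the timestamps come from a clock that keeps advancing while the device sleeps; the class `SleepSkewSketch` and its method names are hypothetical and not taken from any Android source.

```java
/** Illustrates why a suspended process can appear to spend too long in destroy(). */
final class SleepSkewSketch {
    /** The 10-second limit mentioned in the answer. */
    static final long TIMEOUT_MILLIS = 10_000;

    /** Elapsed time as seen by a clock that keeps running during device sleep. */
    static long measuredMillis(long workMillis, long sleepMillis) {
        return workMillis + sleepMillis;
    }

    /** True when the measured time exceeds the limit, even if the real work was short. */
    static boolean wouldTimeOut(long workMillis, long sleepMillis) {
        return measuredMillis(workMillis, sleepMillis) > TIMEOUT_MILLIS;
    }

    public static void main(String[] args) {
        // 5 ms of real work, no sleep: well under the limit.
        System.out.println(wouldTimeOut(5, 0));       // prints false
        // The same 5 ms of work, interrupted by a 15-second sleep.
        System.out.println(wouldTimeOut(5, 15_000));  // prints true
    }
}
```

The point of the example is that the finalizer itself did nothing wrong in the second case: the 15 seconds were spent suspended, not executing.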

Bottom line: the issue cannot be avoided - you will encounter it if your app runs in the background. You can mitigate it by taking a wakelock and preventing the device from sleeping, but that is a different story altogether, a new headache, and maybe another talk at another con. You can minimize the problem by reducing GC calls - making the scenario less likely. Tips are in the slides.
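The slides are not reproduced here, so the following is not necessarily one of the author's tips. It is one common, generic way to create less garbage (and therefore trigger fewer GC runs): reuse buffers instead of allocating new ones. The class `BufferPool` and its parameters are hypothetical, and the sketch assumes single-threaded use.

```java
import java.util.ArrayDeque;

/** Minimal single-threaded buffer pool; reusing buffers means fewer allocations to collect. */
final class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;
    private final int maxPooled;

    BufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    /** Returns a pooled buffer if one is available, otherwise allocates a new one. */
    byte[] acquire() {
        byte[] buf = free.poll();
        return buf != null ? buf : new byte[bufferSize];
    }

    /** Hands a buffer back for reuse; buffers of the wrong size or beyond the cap are dropped. */
    void release(byte[] buf) {
        if (buf.length == bufferSize && free.size() < maxPooled) {
            free.push(buf);
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(4096, 8);
        byte[] first = pool.acquire();
        pool.release(first);
        // The second acquire reuses the released buffer instead of allocating.
        System.out.println(pool.acquire() == first);
    }
}
```

Capping the pool with `maxPooled` keeps the pool itself from holding on to an unbounded amount of memory, which would defeat the purpose.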

I have not yet had the chance to go over the Dalvik 2 (a.k.a. ART) GC code - which boasts a new generational compacting feature - or to perform any experiments on Lollipop.

Added 7/5/2015: After reviewing the crash report aggregation for this crash type, it looks like crashes from version 5.0+ of Android (Lollipop with ART) account for only 0.5% of this crash type. This means that the ART GC changes have reduced the frequency of these crashes.
