How to debug SEGV_ACCERR


Question

I have an app that streams video using Kickflip and ButterflyTV libRTMP.

The app works fine 99% of the time, but from time to time I get a native segmentation fault that I am not able to debug, since the messages are too cryptic:

01-24 10:52:25.576 199-199/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-24 10:52:25.576 199-199/? A/DEBUG: Build fingerprint: 'google/hammerhead/hammerhead:6.0.1/M4B30Z/3437181:user/release-keys'
01-24 10:52:25.576 199-199/? A/DEBUG: Revision: '11'
01-24 10:52:25.576 199-199/? A/DEBUG: ABI: 'arm'
01-24 10:52:25.576 199-199/? A/DEBUG: pid: 14302, tid: 14382, name: MuxerThread  >>> tv.myapp.broadcast.dev <<<
01-24 10:52:25.576 199-199/? A/DEBUG: signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x9fef1000
01-24 10:52:25.636 199-199/? A/DEBUG: Abort message: 'Setting to ready!'
01-24 10:52:25.636 199-199/? A/DEBUG:     r0 9c6f9500  r1 9c6f94fc  r2 9fee900c  r3 00007ff4
01-24 10:52:25.636 199-199/? A/DEBUG:     r4 9fee9010  r5 9fef0ffd  r6 00007ff1  r7 9fef0d88
01-24 10:52:25.636 199-199/? A/DEBUG:     r8 cfe40980  r9 9e0a6900  sl 00007ff4  fp 9c6f94fc
01-24 10:52:25.636 199-199/? A/DEBUG:     ip 9c6f9058  sp 9c6f94dc  lr 000000e9  pc b3a33cb6  cpsr 800f0030
01-24 10:52:25.650 199-199/? A/DEBUG: backtrace:
01-24 10:52:25.651 199-199/? A/DEBUG:     #00 pc 00004cb6  /data/app/tv.myapp.broadcast.dev-2/lib/arm/librtmp-jni.so
01-24 10:52:25.651 199-199/? A/DEBUG:     #01 pc 00005189  /data/app/tv.myapp.broadcast.dev-2/lib/arm/librtmp-jni.so (rtmp_sender_write_video_frame+28)
01-24 10:52:25.651 199-199/? A/DEBUG:     #02 pc 00005599  /data/app/tv.myapp.broadcast.dev-2/lib/arm/librtmp-jni.so (Java_net_butterflytv_rtmp_1client_RTMPMuxer_writeVideo+60)
01-24 10:52:25.651 199-199/? A/DEBUG:     #03 pc 014e84e7  /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (int net.butterflytv.rtmp_client.RTMPMuxer.writeVideo(byte[], int, int, int)+122)
01-24 10:52:25.651 199-199/? A/DEBUG:     #04 pc 014dbd55  /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (void io.kickflip.sdk.av.muxer.RtmpMuxerMix.writeThread()+2240)
01-24 10:52:25.651 199-199/? A/DEBUG:     #05 pc 014d8c41  /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (void io.kickflip.sdk.av.muxer.RtmpMuxerMix.access$000(io.kickflip.sdk.av.muxer.RtmpMuxerMix)+60)
01-24 10:52:25.651 199-199/? A/DEBUG:     #06 pc 014d819f  /data/app/tv.myapp.broadcast.dev-2/oat/arm/base.odex (offset 0xa66000) (void io.kickflip.sdk.av.muxer.RtmpMuxerMix$1.run()+98)
01-24 10:52:25.651 199-199/? A/DEBUG:     #07 pc 721e78d1  /data/dalvik-cache/arm/system@framework@boot.oat (offset 0x1ed6000)

Again, in a 2 hour stream this might not ever happen or it might happen 10 minutes into the stream. It is super hard to debug because I cannot force the bug to happen.

Is there any way to improve the debugging information I get? What exactly does SEGV_ACCERR mean? I've read that it "means you tried to access an address that you don't have permission to access", but I am unsure what that means in practice, as I can stream for hours without the bug happening.

Is there any way to catch the signal and just continue?

EDIT: to add more information, this is the part of the native library where the app crashes (found using ndk-stack):

JNIEXPORT jint JNICALL
Java_net_butterflytv_rtmp_1client_RTMPMuxer_writeVideo(JNIEnv *env, jobject instance,
                                                       jbyteArray data_, jint offset, jint length,
                                                       jint timestamp) {
    jbyte *data = (*env)->GetByteArrayElements(env, data_, NULL);
    jint result = rtmp_sender_write_video_frame(data, length, timestamp, 0, 0);
    (*env)->ReleaseByteArrayElements(env, data_, data, 0);

    return result;
}


int rtmp_sender_write_video_frame(uint8_t *data,
                                  int size,
                                  uint64_t dts_us,
                                  int key,
                                  uint32_t abs_ts)
{


    uint8_t * buf;
    uint8_t * buf_offset;
    int val = 0;
    int total;
    uint32_t ts;
    uint32_t nal_len;
    uint32_t nal_len_n;
    uint8_t *nal;
    uint8_t *nal_n;
    char *output ;
    uint32_t offset = 0;
    uint32_t body_len;
    uint32_t output_len;

    buf = data;
    buf_offset = data;
    total = size;
    ts = (uint32_t)dts_us;

    //ts = RTMP_GetTime() - start_time;
    offset = 0;

    nal = get_nal(&nal_len, &buf_offset, buf, total);

(...)


}



static uint8_t * get_nal(uint32_t *len, uint8_t **offset, uint8_t *start, uint32_t total)
{
    uint32_t info;
    uint8_t *q ;
    uint8_t *p  =  *offset;
    *len = 0;




    if ((p - start) >= total)
        return NULL;

    while(1) {
        info =  find_start_code(p, 3);

        if (info == 1)
            break;
        p++;
        if ((p - start) >= total)
            return NULL;
    }
    q = p + 4;
    p = q;

    while(1) {
        info =  find_start_code(p, 3);

        if (info == 1)
            break;
        p++;
        if ((p - start) >= total)
            //return NULL;
            break;
    }


    *len = (p - q);
    *offset = p;
    return q;
}


static uint32_t find_start_code(uint8_t *buf, uint32_t zeros_in_startcode)
{
    uint32_t info;
    uint32_t i;

    info = 1;
    if ((info = (buf[zeros_in_startcode] != 1)? 0: 1) == 0)
        return 0;

    for (i = 0; i < zeros_in_startcode; i++)
        if (buf[i] != 0)
        {
            info = 0;
            break;
        };

    return info;
}

The crash happens at buf[zeros_in_startcode] in find_start_code. I have removed a few android_log lines as well (I don't think this matters).

To my understanding, this buffer should be accessible, so it makes no sense that it crashes only "sometimes".

PS. this is where I call the native code from Java:

private void writeThread() {

       while (true) {

           Frame frame = null;
           synchronized (mBufferLock) {
              if (!mConfigBuffer.isEmpty()) {
                   frame = mConfigBuffer.peek();
               } else if (!mBuffer.isEmpty()) {
                   frame = mBuffer.remove();
               }
               if (frame == null) {
                   try {
                       mBufferLock.wait();
                   } catch (InterruptedException e) {
                   }
               }
           }

           if (frame == null) {
               continue;
           } else if (frame instanceof Sentinel) {
               break;
           }


           int writeResult = 0;

           synchronized (mWriteFence) {
               if (!mConnected) {
                   debug(WARN, "Skipping frame due to disconnection");
                   continue;
               }

               if (frame.getFrameType() == Frame.VIDEO_FRAME) {              
                   writeResult = mRTMPMuxer.writeVideo(frame.getData(), frame.getOffset(), frame.getSize(), frame.getTime());
               } else if (frame.getFrameType() == Frame.AUDIO_FRAME) {
                   writeResult = mRTMPMuxer.writeAudio(frame.getData(), frame.getOffset(), frame.getSize(), frame.getTime());

               }

               if (writeResult < 0) {
                       mRtmpListener.onDisconnected();
                       mConnected = false;
               } else {
                   //Now we remove the config frame, only if sending was successful!
                   if (frame.isConfig()) {
                       synchronized (mBufferLock) {
                           mConfigBuffer.remove();
                       }
                   }
               }
           }

       }

   }

Note that the crash happens even when I don't send audio at all.

Solution

"You can store the data in a byte[]. This allows very fast access from managed code. On the native side, however, you're not guaranteed to be able to access the data without having to copy it."

See https://developer.android.com/training/articles/perf-jni.html

Analysis

Some musings and things to try:

  • The code where it falls over is very generic, so there is probably no bug there.
  • The frame data must have been removed, damaged, locked, or moved.
  • Has the Java garbage collector removed or relocated the data?
  • You could write detailed debug output to a file, overwriting it on every frame, so you only keep a small log with the last debug info.
  • Send a local copy of the frame data (using a direct ByteBuffer) to mRTMPMuxer.writeVideo.
    Unlike regular byte arrays, a direct ByteBuffer's storage is not allocated on the managed heap and can always be accessed directly from native code.
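
The logging idea in the list above could look roughly like this on the native side. It is only a sketch: record_last_frame and the log path are hypothetical names that are not part of libRTMP, and the fields written are just the ones visible in rtmp_sender_write_video_frame.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: overwrite `path` with one line describing the frame
 * that is about to be processed. Returns 0 on success, -1 on failure. */
static int record_last_frame(const char *path, const uint8_t *data,
                             uint32_t size, uint32_t ts)
{
    FILE *f = fopen(path, "w");   /* "w" truncates, so the log stays one line long */
    if (f == NULL)
        return -1;

    int ok = fprintf(f, "size=%u ts=%u data=%p\n",
                     (unsigned)size, (unsigned)ts, (const void *)data) > 0;

    /* fclose hands the line to the OS before the risky code runs */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling it at the top of rtmp_sender_write_video_frame, with a path inside the app's own files directory, would leave the size and address of the last frame on disk after a crash. Those can then be compared with the fault address in the tombstone.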

Implementation

//allocates memory from the native heap
ByteBuffer data = ByteBuffer.allocateDirect(frame.getData().length);
data.clear();
//System.gc();
//copy the frame bytes into the direct buffer (put, not get: get would copy out of the buffer)
data.put(frame.getData(), 0, frame.getData().length);
//data = (frame.getData() == null) ? null : frame.getData().clone();
int offset  = frame.getOffset();
int size    = frame.getSize();
int time    = frame.getTime();
//the Java declaration of writeVideo must now take a ByteBuffer instead of a byte[]
writeResult = mRTMPMuxer.writeVideo(data, offset, size, time);

JNIEXPORT jint JNICALL
Java_net_butterflytv_rtmp_1client_RTMPMuxer_writeVideo(
    JNIEnv *env,
    jobject instance,
    jobject data_, //NOT jbyteArray data_,
    jint offset,
    jint length,
    jint timestamp) 
{
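    //GetDirectBufferAddress NOT GetByteArrayElements (C call syntax, as in the original wrapper)
    uint8_t *data = (uint8_t *) (*env)->GetDirectBufferAddress(env, data_);
    if (data == NULL) return -1; //not a direct buffer; writeThread treats a negative result as a failure
    jint result = rtmp_sender_write_video_frame(data, length, timestamp, 0, 0);
    //no ReleaseByteArrayElements call: a direct buffer does not need to be released
    return result;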
}

Debugging

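Some code from the SO question "Catching exceptions thrown from native code". Note that try/catch only catches C++ exceptions: a SIGSEGV is a signal and is not turned into an exception, so this helps only with errors that are actually thrown. It also requires compiling the file as C++ and having a JNIEnv in scope: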

    static uint32_t find_start_code(uint8_t *buf, uint32_t zeros_in_startcode){
        //...
        try {
            if ((info = (buf[zeros_in_startcode] != 1)? 0: 1) == 0) return 0;//your code
        }
        // You can catch std::exception for more generic error handling
        catch (const std::exception &e){
            throwJavaException (env, e.what());//see method below; env has to be passed in
        }
        //...

Then a new method:

    void throwJavaException(JNIEnv *env, const char *msg)
    {
        // You can put your own exception here
        jclass c = env->FindClass("java/lang/RuntimeException");
        if (NULL == c)
        {
            //B plan: null pointer ...
            c = env->FindClass("java/lang/NullPointerException");
        }
        if (NULL != c)
        {
            env->ThrowNew(c, msg);
        }
    }
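
A SIGSEGV is delivered as a POSIX signal rather than as a C++ exception, so a handler installed with sigaction is what actually sees it. The sketch below shows one way to log the si_code before the normal crash reporting runs. The names segv_code_name, segv_logger and install_segv_logger are hypothetical, and the message goes to stderr only to keep the example portable; on Android you would want a different sink.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* The handler that was installed before ours (on Android this is normally
 * the crash reporter that produces the tombstone). */
static struct sigaction previous_action;

/* Maps the si_code of a SIGSEGV to a readable name. */
static const char *segv_code_name(int code)
{
    switch (code) {
    case SEGV_MAPERR: return "SEGV_MAPERR";  /* address not mapped */
    case SEGV_ACCERR: return "SEGV_ACCERR";  /* mapped, but access not permitted */
    default:          return "SEGV_OTHER";
    }
}

static void segv_logger(int sig, siginfo_t *info, void *context)
{
    (void)context;
    const char *name = segv_code_name(info->si_code);

    /* Only async-signal-safe calls are used inside the handler. */
    size_t len = 0;
    while (name[len] != '\0')
        len++;
    ssize_t r = write(STDERR_FILENO, name, len);
    r = write(STDERR_FILENO, "\n", 1);
    (void)r;  /* nothing useful can be done if the write fails */

    /* Restore the previous handler and return: the faulting instruction
     * runs again and the original crash reporting takes over. */
    sigaction(sig, &previous_action, NULL);
}

/* Installs the logger; returns 0 on success, -1 on failure. */
static int install_segv_logger(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_logger;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, &previous_action);
}
```

install_segv_logger would be called once, for example when the library is loaded. The handler deliberately does not try to "just continue": returning to the faulting instruction without fixing the cause would simply fault again, and skipping it would leave the muxer in an unknown state.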

Don't get too hung up on SEGV_ACCERR: you have a segmentation fault, SIGSEGV (caused by a program trying to read or write an illegal memory location; a read in your case).
From siginfo.h:

SEGV_MAPERR means you tried to access an address that doesn't map to anything. SEGV_ACCERR means you tried to access an address that you don't have permission to access.
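
One detail in the tombstone may be worth checking against this. The fault address 0x9fef1000 is page-aligned, and r5 (0x9fef0ffd) is exactly three bytes below it. If r5 holds buf (an assumption, the tombstone does not say so), the faulting read would be buf[3] just past the end of an accessible page. get_nal compares p with total, but find_start_code then reads up to three bytes beyond p, so a frame that ends right at a page boundary could trigger the fault only occasionally. The following is a minimal sketch of a bounds-aware scan; find_start_code_checked and next_start_code are hypothetical names, not libRTMP functions, and the extra length parameter is an addition.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the 4-byte start code 00 00 00 01 begins at buf.
 * `remaining` is the number of valid bytes from buf to the end of the
 * frame; nothing is read unless at least 4 bytes remain. */
static uint32_t find_start_code_checked(const uint8_t *buf, size_t remaining)
{
    if (remaining < 4)
        return 0;
    return buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1;
}

/* Returns the offset of the first start code at or after `from` inside
 * [start, start + total), or `total` if there is none. */
static size_t next_start_code(const uint8_t *start, size_t total, size_t from)
{
    for (size_t i = from; i + 4 <= total; i++) {
        if (find_start_code_checked(start + i, total - i))
            return i;
    }
    return total;
}
```

With a helper like this, get_nal could locate the start of a NAL unit at next_start_code(start, total, 0) + 4 and its end at the following start code (or at total), without ever touching memory beyond the frame. Whether that removes the crash in this app would still need to be confirmed on a device.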

Other

This may be of interest:

Q: I noticed that there was RTMP support. But a patch which remove RTMP had been merged.
Q: Could you tell me why ?
A: We don't think RTMP serves the mobile broadcasting use case as well as HLS,
A: and so we don't want to dedicate our limited resources towards supporting it.

see: https://github.com/Kickflip/kickflip-android-sdk/issues/33

I suggest you register an issue with:
https://github.com/Kickflip/kickflip-android-sdk/issues
https://github.com/ButterflyTV/LibRtmp-Client-for-Android/issues
