Decompile C code with debug info?

Question

Java and Python bytecode is easier to decompile than the machine code generated by a C/C++ compiler.

I am unable to find a convincing answer as to why the information from the -g option is insufficient for de-compilation, but sufficient for debugging? What extra information does Python/Java bytecode contain that makes decompilation easy?

Solution

I am unable to find a convincing answer as to why the information from the -g option is insufficient for de-compilation, but sufficient for debugging?

The debugging information basically contains only a mapping between the addresses in the generated code and the line numbers of the source files. The debugger does not need to decompile the code; it just shows you the original sources. If the source files are missing, the debugger won't magically show them.
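To make the address-to-line mapping concrete, here is a minimal sketch in C of how a debugger might resolve a program counter to a source location. The table layout, the file name `mytree.cpp`, the addresses, and the line numbers are all invented for illustration; real formats such as the DWARF line number program are encoded very differently, so treat this only as a model of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of a simplified, hypothetical line table: the code starting at
   `addr` (up to the next row) was generated from `file`:`line`. */
struct line_entry {
    uint32_t    addr;
    const char *file;
    int         line;
};

/* Rows sorted by ascending address. All values are made up. */
static const struct line_entry line_table[] = {
    { 0x4050A0, "mytree.cpp", 41 },
    { 0x4050A6, "mytree.cpp", 43 },
    { 0x4050B4, "mytree.cpp", 45 },
    { 0x4050BB, "mytree.cpp", 46 },
};

/* Return the last row whose address is <= pc, or NULL if pc lies before
   the first row. The table must be sorted by address. */
static const struct line_entry *
lookup_line(const struct line_entry *table, size_t n, uint32_t pc)
{
    assert(table != NULL || n == 0);

    const struct line_entry *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (table[i].addr > pc)
            break;              /* rows are sorted, so we can stop here */
        best = &table[i];
    }
    return best;
}
```

With this table, `lookup_line(line_table, 4, 0x4050B0)` returns the row for `mytree.cpp` line 43. Note that the lookup yields only a file name and a line number: the debugger still has to open the actual source file to display anything, which is why this information alone does not help when the sources are missing.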

That said, the presence of debugging info does make decompilation easier. If the debug info includes the layout of the types used and the function prototypes, the decompiler can use it to produce a much more precise decompilation. In many cases, however, the result will still differ from the original source.

For example, here's a function decompiled with the Hex-Rays decompiler without using the debug info:

int __stdcall sub_4050A0(int a1)
{
  int result; // eax@1

  result = a1;
  if ( *(_BYTE *)(a1 + 12) )
  {
    result = sub_404600(*(_DWORD *)a1);
    *(_BYTE *)(a1 + 12) = 0;
  }
  return result;
}

Since it does not know the type of a1, the accesses to its fields are represented as additions and casts.
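The equivalence between the two forms can be sketched in plain C. The `PAGE` layout below is an assumption: only the field names `baseAddr` and `isChanged` appear in the decompiler output, while the field types and the `reserved` padding member are guesses chosen so that `isChanged` lands at byte offset 12, matching `a1 + 12`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for PAGE; types and padding are guesses. */
typedef struct PAGE {
    uint32_t baseAddr;     /* offset 0, seen as *(_DWORD *)a1       */
    uint32_t reserved[2];  /* offsets 4..11, contents unknown       */
    uint8_t  isChanged;    /* offset 12, seen as *(_BYTE *)(a1 + 12) */
} PAGE;

/* Without type information: pointer arithmetic plus a cast, which is all
   a decompiler can emit when it only sees "some address + 12". */
static uint8_t is_changed_untyped(const void *a1)
{
    assert(a1 != NULL);
    return *(const uint8_t *)((const char *)a1 + 12);
}

/* With type information: the same read, expressed by field name. */
static uint8_t is_changed_typed(const PAGE *src)
{
    assert(src != NULL);
    return src->isChanged;
}
```

Both functions read the same byte; the machine code for them would typically be identical. The struct definition is exactly the kind of information that debug info can supply and that the bare machine code lacks.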

And here's the same function after the symbol file has been loaded:

void __thiscall mytree::write_page(mytree *this, PAGE *src)
{
  if ( src->isChanged )
  {
    cache::set_changed(this->cache, src->baseAddr);
    src->isChanged = 0;
  }
}

You can see that it's been improved quite a lot.

As for why decompiling bytecode is usually easier, in addition to NPE's answer check also this.
