Are executables produced with Cython really free of the source code?


Question

I have read Making an executable in Cython and BuvinJ's answer to How to obfuscate Python code effectively?, and I would like to test whether source code compiled with Cython is really "no longer there" after compilation. It is a popular opinion that using Cython is a way to protect Python source code; see, for example, the article Protecting Python Sources With Cython.

Let's take this simple example, test.pyx:

import json, time  # this will allow to see what happens when we import a library
print(json.dumps({'key': 'hello world'}))
time.sleep(3)
print(1/0)  # division error!

Then let's use Cython:

cython test.pyx --embed

This produces test.c. Let's compile it:

call "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\vcvarsall.bat" x64
cl test.c /I C:\Python37\include /link C:\Python37\libs\python37.lib

It works! It produces a 140 KB test.exe executable, nice!

But the answer to How to obfuscate Python code effectively? implies that this "compilation" hides the source code. That does not seem to be true: if you run test.exe, you will see:

Traceback (most recent call last):
  File "test.pyx", line 4, in init test
    print(1/0)  # division error!         <-- the source code and even the comments are still there!
ZeroDivisionError: integer division or modulo by zero

This shows that the source code is still present in human-readable form.

Question: Is there a way to compile code with Cython such that the claim "the source code is no longer revealed" is true?

Note: I'm looking for a solution where neither the source code nor the bytecode (.pyc) is present. (If the bytecode/.pyc is embedded, it is trivial to recover the source code with uncompyle6.)

PS: I remembered making the same observation a few years ago but could not find it anymore. After some more searching, here it is: Is it possible to decompile a .dll/.pyd file to extract Python Source Code?

Recommended answer

The code is found in the original pyx-file next to your exe. Delete this pyx-file, or don't distribute it with your exe.

When you look at the generated C-code, you will see why the error message is shown by your executable:

For a raised error, Cython emits code similar to the following:

__PYX_ERR(0, 11, __pyx_L3_error) 

where __PYX_ERR is a macro defined as:

#define __PYX_ERR(f_index, lineno, Ln_error) \
{ \
  __pyx_filename = __pyx_f[f_index]; __pyx_lineno = lineno; __pyx_clineno = __LINE__; goto Ln_error; \
}

and the variable __pyx_f is defined as

static const char *__pyx_f[] = {
  "test.pyx",
  "stringsource",
};

Basically, __pyx_f[0] tells where the original code can be found. When an exception is raised, the (embedded) Python interpreter looks for your original pyx-file and finds the corresponding code (this can be looked up in __Pyx_AddTraceback, which is called when an error is raised).

Once this pyx-file is not around, the original source code is no longer known to the Python interpreter or to anybody else. The error trace will still show the names of the functions and the line numbers, but no longer any code snippets.
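The lookup-by-file-name behaviour described above can be reproduced in plain Python, without Cython. The following is only a minimal sketch under assumptions: the module text in SOURCE and the file name test_module.py are made up for illustration, and it exercises the standard traceback and linecache modules rather than Cython's __Pyx_AddTraceback. It shows that the traceback text contains the offending line only while the file it names can still be read from disk.

```python
import linecache
import os
import tempfile
import traceback

# Hypothetical three-line module; the last line fails like the example in the question.
SOURCE = "import time\nvalue = 1\nprint(value / 0)  # division error!\n"


def format_failure(path):
    """Run SOURCE as if it had been loaded from `path` and return the traceback text."""
    code = compile(SOURCE, path, "exec")  # the code object records the file name, not the text
    try:
        exec(code, {})
    except ZeroDivisionError:
        linecache.clearcache()  # forget previously read lines, force a fresh lookup
        return traceback.format_exc()
    raise AssertionError("SOURCE was expected to raise ZeroDivisionError")


def tracebacks_with_and_without_source():
    """Return (traceback with the file on disk, traceback after deleting it)."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "test_module.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(SOURCE)
        with_source = format_failure(path)
        os.remove(path)
        without_source = format_failure(path)
    return with_source, without_source


if __name__ == "__main__":
    for text in tracebacks_with_and_without_source():
        print(text)
        print("-" * 40)
```

Both tracebacks still name the file and the line number; only the first one can quote the line and its comment, because the text is read from the file when the traceback is formatted.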

The resulting executable (or extension, if one creates one) doesn't contain any bytecode (as in pyc-files) and cannot be decompiled with tools like uncompyle. Bytecode is produced when a py-file is translated into Python opcodes, which are then evaluated in a huge loop in ceval.c. For builtin/Cython modules no bytecode is needed, because the resulting code uses Python's C-API directly, cutting out the need to have or evaluate the opcodes. These modules skip interpretation, which is one reason they are faster. Thus no bytecode will be in the executable.
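For contrast, here is a rough sketch of what a regular py-file turns into. It compiles two lines from the question's example with the built-in compile(), round-trips the code object through marshal (the serialization a .pyc file uses after its header), and lists the names and string constants that survive. The file name test.py and the helper summarize are illustrative assumptions; the point is only that bytecode keeps enough structure for decompilers to work with, which is exactly the artifact a Cython-built binary does not carry.

```python
import dis
import marshal

# Two lines from the question's example, compiled the regular (non-Cython) way.
SOURCE = "import json\nprint(json.dumps({'key': 'hello world'}))\n"

code = compile(SOURCE, "test.py", "exec")
payload = marshal.dumps(code)        # serialized code object, as stored in a .pyc after the header
restored = marshal.loads(payload)    # anyone holding the bytes can load them back


def summarize(code_obj):
    """Collect the identifiers and string constants stored in a code object."""
    return {
        "names": list(code_obj.co_names),
        "strings": [c for c in code_obj.co_consts if isinstance(c, str)],
    }


if __name__ == "__main__":
    print(summarize(restored))
    dis.dis(restored)  # the opcode listing that the interpreter loop evaluates
```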

One important note, though: one should check that the linker doesn't include debug information (and thus the C-code, where the pyx-file content can be found as comments). MSVC with the /Z7 option is such an example.
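One way to run such a check is to search the built binary for lines of the original source. The helper below is a hypothetical sketch, not a Cython or MSVC tool: find_leaked_lines and its min_length threshold are names invented here, and it only detects source lines stored verbatim as UTF-8 bytes, so an empty result is a hint rather than a proof that nothing leaked.

```python
import sys


def find_leaked_lines(binary: bytes, source: str, min_length: int = 8) -> list[str]:
    """Return the stripped source lines that occur verbatim inside `binary`."""
    leaked = []
    for line in source.splitlines():
        text = line.strip()
        # Skip very short lines; they match by accident too easily.
        if len(text) >= min_length and text.encode("utf-8") in binary:
            leaked.append(text)
    return leaked


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are examples): python check_leak.py test.exe test.pyx
    exe_path, pyx_path = sys.argv[1], sys.argv[2]
    with open(exe_path, "rb") as exe_file, open(pyx_path, encoding="utf-8") as pyx_file:
        for leaked_line in find_leaked_lines(exe_file.read(), pyx_file.read()):
            print("found in binary:", leaked_line)
```

Note that string literals such as 'hello world' are part of the compiled program and can show up in the binary on their own, so matches on full statements or comments are the more telling signal.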

However, the resulting executable can be disassembled to assembler, and the generated C-code can then be reverse engineered. So while cythonizing is OK for making the code hard to understand, it is not the right tool to conceal keys or security algorithms.
