Data corruption: Where's the bug‽

Question

Last edit: I've figured out what the problem was (see my own answer below), but it seems I cannot mark the question as answered. If someone can answer the question I raise in my answer below, namely whether this is a bug in Cython or Cython's intended behavior, I will mark that answer as accepted, because that would be the most useful lesson to take from this, IMHO.

Firstly, I have to start by saying that I have been trying to figure this out for three days, and I am just banging my head against the wall. As best as I can tell from the documentation, I am doing things correctly. Obviously, I can't be doing things correctly, though, because if I were, I wouldn't have a problem (right?).

In any event, I am working on a binding for mcrypt to Python. It should work with both Python 2 and Python 3 (though it's untested for Python 2). It's available on my site, linked because it is way too large to include in the post, and given that I don't know what I am doing wrong, I cannot even isolate what might be the problem code. The script that shows the problem is also on my site. The script just feeds 100 blocks of nothing but the letter "a" (in whatever block size the encryption algorithm/encryption mode uses), and of course should get a block of "a" as the result of roundtripping. But it does not (always). Here is output from a single run of it:

Wed Dec 15 10:35:44 EST 2010
test.py:5: McryptSecurityWarning: get_key() is not recommended
  return ''.join(['{:02x}'.format(x) for x in o.get_key()])

key: b'\x01ez\xd5\xa9\xf9\x1f)\xa0G\xd2\xf2Z\xfc{\x7fn\x02?,\x08\x1c\xc8\x03\x061X\xb5\xc9\x99\xd0\xca'
key: b'\x01ez\xd5\xa9\xf9\x1f)\xa0G\xd2\xf2Z\xfc{\x7fn\x02?,\x08\x1c\xc8\x03\x061X\xb5\xc9\x99\xd0\xca'
16
self test result: 0
enc parameters: {'salt': '6162636465666768', 'mode': 'cbc', 'algorithm': 'rijndael-128', 'iv': '61626364616263646162636461626364'}
dec parameters: {'salt': '6162636465666768', 'mode': 'cbc', 'algorithm': 'rijndael-128', 'iv': '61626364616263646162636461626364'}
enc key: 01657ad5a9f91f29a047d2f25afc7b7f6e023f2c081cc803063158b5c999d0ca
dec key: 01657ad5a9f91f29a047d2f25afc7b7f6e023f2c081cc803063158b5c999d0ca
Stats: 88 / 100 good packets (88.0%)

#5: b'aaaaaaaaaaaaaaaa' != b'\xa6\xb8\xf9\td\x8db\xf6\x00Y"ST\xc6\x9b\xe7'
#6: b'aaaaaaaaaaaaaaaa' != b'aaaaaaa1\xb3@\x8d\xff\xf9\xafpy'
#13: b'aaaaaaaaaaaaaaaa' != b'\xb9\xc8\xaf\x1f\xb8\x8c\x0b_\x15s\x9d\xecN,*w'
#14: b'aaaaaaaaaaaaaaaa' != b'aaaaaaaaaaaaa\xeb?\x13'
#49: b'aaaaaaaaaaaaaaaa' != b'_C\xf2\x15\xd5k\xe1XKIF5k\x82\xa4\xec'
#50: b'aaaaaaaaaaaaaaaa' != b'aaaaaaaaaaa+\xdf>\x01\xee'
#74: b'aaaaaaaaaaaaaaaa' != b'\x1c\xdf0\x05\xc7\x0b\xe9\x93H\xc5B\xd7\xcfj+\x03'
#75: b'aaaaaaaaaaaaaaaa' != b'aaaaaaaaaaaaw+\xed\x0f'
#79: b'aaaaaaaaaaaaaaaa' != b"\xf2\x89\x1ct\xe1\xeeBWo\xb4-\xb9\x085'\xef"
#80: b'aaaaaaaaaaaaaaaa' != b'aaaaaaaaaaa\xcc\x01n\xf0<'
#91: b'aaaaaaaaaaaaaaaa' != b'g\x02\x08\xbf\xa5\xd7\x90\xc1\x84D\xf3\x9d$a)\x06'
#92: b'aaaaaaaaaaaaaaaa' != b'aaaaaaaaaaaaaaa\x01'

The weird part is that it is exactly the same for a given (algorithm, mode) pair. I can change the algorithm and it will result in different round-trips, but always the same for every run when I don't change the algorithm. I'm absolutely stumped. Also, it's always two blocks in a row that are corrupt as you can see in the output above: blocks 5 and 6, 13 and 14, etc. So, there is a pattern but I am, for whatever reason, unable to figure out what that pattern is pointing to precisely.
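The kind of round-trip check described above can be sketched as follows. This is only an illustration, not the actual test script: `roundtrip_report`, `toy_encrypt`, and `toy_decrypt` are hypothetical names, the XOR "cipher" is an insecure stand-in for the mcrypt binding, and unlike a real CBC stream it keeps no state between blocks.

```python
def roundtrip_report(encrypt, decrypt, block_size, count=100):
    """Feed `count` blocks of b"a" through encrypt/decrypt and collect mismatches."""
    plain = b"a" * block_size
    failures = []
    for i in range(count):
        result = decrypt(encrypt(plain))
        if result != plain:
            failures.append((i, plain, result))
    return count - len(failures), failures


# Stand-in "cipher" for demonstration only: XOR with a fixed byte.
# It is NOT secure and NOT the mcrypt binding from the question.
def toy_encrypt(block):
    return bytes(b ^ 0x5A for b in block)


def toy_decrypt(block):
    return bytes(b ^ 0x5A for b in block)


count = 100
good, failures = roundtrip_report(toy_encrypt, toy_decrypt, block_size=16, count=count)
print(f"Stats: {good} / {count} good packets ({100.0 * good / count:.1f}%)")
for index, expected, actual in failures:
    print(f"#{index}: {expected!r} != {actual!r}")
```

Passing the encrypt and decrypt steps in as callables keeps the check independent of any particular binding, so the same loop could be pointed at a known-good implementation for comparison.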

I realize that I am probably asking a lot here: I can't isolate a small snip of code, and familiarity with both mcrypt and Python is probably required. Alas, after three days of hitting my head on this, I need to step away from the problem for a little bit, so I am posting this here in the hopes that maybe while I am taking a break from this problem either (a) someone will see where I introduced a bug, (b) I will be able to see my bug when I get back to the problem later, or (c) someone or myself can find the problem which maybe isn't a bug in my code but a bug in the binding process or the library itself.

One thing I haven't done is try another version of the mcrypt library. I'm working with Cython 0.13, Python 3.1, and mcrypt 2.5.8, all as distributed by Ubuntu in Ubuntu 10.10 (except Cython, which I got from PyPI). But I manage systems with PHP applications that use mcrypt on Ubuntu 10.10 and function just fine without data corruption, so I have no reason to believe the build of mcrypt is at fault, which just leaves… well, something wrong on my part somewhere, I think.

In any case, I thank anyone profusely who can help. I'm starting to feel like I'm going crazy because I've been working on this problem pretty much non-stop for days and I get the feeling that the solution is probably right in front of me, but I cannot see it.

Edit: Someone pointed out that I should be using memcpy instead of strncpy. I did that, but now, the test script shows that every block is incorrect. Color me even more confused than previously... here's the new output on pastebin.
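For context on the memcpy suggestion: C's `strncpy` stops copying at the first NUL byte and fills the rest of the destination with zeros, while `memcpy` copies exactly the requested number of bytes. Ciphertext is arbitrary binary data, so a NUL byte can appear anywhere in it. The sketch below only mimics those two semantics in plain Python (the helper names and the sample block are made up for illustration); it does not call the C functions.

```python
def strncpy_like(src: bytes, n: int) -> bytes:
    """Mimic C strncpy(dst, src, n): stop at the first NUL, zero-fill the rest."""
    head = src[:n].split(b"\x00", 1)[0]
    return head + b"\x00" * (n - len(head))


def memcpy_like(src: bytes, n: int) -> bytes:
    """Mimic C memcpy(dst, src, n): copy exactly n bytes, NULs included."""
    return src[:n]


# Hypothetical 8-byte ciphertext block that happens to contain a NUL at index 2.
block = b"\x9c\x41\x00\xd3\x17\xee\x08\x5b"

print(strncpy_like(block, len(block)))  # everything after the NUL is lost
print(memcpy_like(block, len(block)))   # block is copied unchanged
```

A random 16-byte block contains at least one NUL byte roughly 6% of the time, which is at least consistent with the handful of damaged block pairs in the first run, though that is an inference rather than something the output proves.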

Edit 2: I have come back to the computer and have been looking at it again, and I'm just adding print statements everywhere to find where things could be going wrong. The following code in the raw_encrypt.step(input) function:

    cdef char* buffer = <char*>malloc(in_len)
    print in_bin[:in_len]
    memcpy(buffer, <const_void *>in_bin, in_len)
    print "Before/after encryption"
    print buffer[:in_len]
    success = cmc.mcrypt_generic(self._mcStream, <void*>buffer, in_len)
    print buffer[:in_len]

The first print statement shows the expected thing, the plaintext that is passed in. However, the second one shows something completely different, even though it should be identical. It seems that there is something going on with Cython that I don't completely understand.

Recommended answer

Oy, I hate to do this (answer my own question), but I found the answer: It is a quirk of Cython which I am going to have to look into (I don't know if it is an intended quirk, or if it is a bug).

The problem comes with the memcpy line. I cast the second parameter to <const_void*>, which matches the Cython definition in the pxd file, but apparently that makes Cython compile the code differently than using <char*>, the latter forcing Cython to pass a pointer to the actual bytes instead of (I guess?) a pointer to the Python object/variable itself.

So, instead of this:

    cdef char* buffer = <char*>malloc(in_len)
    # <const_void *> apparently hands memcpy a pointer to the Python object itself
    memcpy(buffer, <const_void *>in_bin, in_len)
    success = cmc.mcrypt_generic(self._mcStream, <void*>buffer, in_len)

it has to be this:

    cdef char* buffer = <char*>malloc(in_len)
    # <char *> makes Cython pass a pointer to the actual bytes
    memcpy(buffer, <char *>in_bin, in_len)
    success = cmc.mcrypt_generic(self._mcStream, <void*>buffer, in_len)

What a strange quirk. I would honestly expect any cast to point to the same location, but it seems that the cast can affect behavior as well.
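A rough way to see the difference between "pointer to the bytes" and "pointer to the object" outside Cython is the ctypes sketch below. It is an analogy only, not the binding's code, and it relies on a CPython implementation detail: `id()` returns the memory address of the object, whose first bytes are the object header (reference count, type pointer) rather than the payload. As far as I understand Cython's casting rules, a `<char *>` cast on a bytes object coerces to its internal buffer, while a cast to a void pointer type yields the object's own address, which would make the behavior intended rather than a bug.

```python
import ctypes

data = b"a" * 16
n = len(data)

# Analogue of <char *>in_bin: ctypes passes the address of the bytes payload.
good = ctypes.create_string_buffer(n)
ctypes.memmove(good, data, n)

# Analogue of casting the object itself to a void pointer. CPython-only
# assumption: id(data) is the address of the object header, not of the payload.
bad = ctypes.create_string_buffer(n)
ctypes.memmove(bad, id(data), n)

print(good.raw == data)  # payload copied correctly
print(bad.raw == data)   # header bytes were copied instead of the payload
```

Copying from the object address still "works" in the sense that it does not crash, which is why the symptom is silent garbage in the buffer rather than an error.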
