xor: how come so slow?
Hi,

I'm trying to encode some byte data. Let's not focus on the encoding
process itself; what I want to emphasize is that the method
create_random_block takes about 0.5s to execute (even Java is faster) on
a Dual-Core 3.0GHz machine:

took 46.746999979s, avg: 0.46746999979s

So I suppose that the xor operation between bytes is what raises the
execution time to 0.5s. Why do I suppose that?
Because Python has no support for bytes, nor for xoring bytes, so I
used a workaround:

I loop over the two strings to be xor-ed
    for every char in the strings
        convert each char to an integer and then xor them (ord)
        insert one char into the result, converting the resulting
        integer back to a char (chr)

I suppose that ord() and chr() are the main problems of my
implementation, but I don't know any other way to xor two bytes of data
(bytes are represented as strings).
For more information, see the code attached.
How should I decrease the execution time?

Thank you

from __future__ import division
import random
import time
import sha
import os

class Encoder(object):
    def create_random_block(self, data, seed, blocksize):
        number_of_blocks = int(len(data)/blocksize)
        random.seed(seed)
        random_block = ['0'] * blocksize
        for index in range(number_of_blocks):
            if int(random.getrandbits(1)) == 1:
                block = data[blocksize*index:blocksize*index+blocksize]
                for bit in range(len(block)):
                    # workaround for bitwise xor of str;
                    # xor is only supported for int, hence ord
                    random_block[bit] = chr(ord(random_block[bit])^ord(block[bit]))
        return ''.join(random_block)

x = Encoder()
piece = os.urandom(1024*1024)
blocksize = 16384
t1 = time.time()
for l in range(100):
    seed = random.getrandbits(32)
    block = x.create_random_block(piece, seed, blocksize)
t2 = time.time()
print 'took ' + str(t2-t1) + 's, avg: ' + str((t2-t1)/100.0) + 's'

My answer is: never do things like this with Python.
You will find this module useful: www.pycrypto.org

On Oct 15, 12:19 pm, Michele <mich...@nectarine.it> wrote:
> How should I decrease the execution time?

A few suggestions for your code:
- Use xrange instead of range.
- Loop over lists directly where you can, instead of over their indexes.
- array.array("B", somestring) may help you because it gives a byte
  "view" of a string.
- Using psyco helps a lot with this kind of code.
- I think numpy arrays can contain text/chars too, so it may offer you
  ways to speed up your code a lot.
- Generally Python is fit to download pages from the net, to act as
  glue between different subsystems, to do bulk string processing,
  etc., but for low-level grunt work like this it's often too slow,
  and you can use other lower-level languages.
- You can use a lib already written, or use an extension; for example
  you can try ShedSkin or Pyd.

Bye,
bearophile
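
As a rough illustration of the array.array suggestion above (not code from the thread, and written for Python 3, where the data would be a bytes object rather than a str), a sketch could look like this. The helper name xor_into is made up for the example:

```python
from array import array

def xor_into(acc, block):
    """XOR the bytes of `block` into the array `acc`, in place.

    `acc` is an array('B'); `block` is bytes of the same length.
    (Hypothetical helper, not part of the original code.)
    """
    # array("B", block) copies the bytes into an array of small ints,
    # so no ord()/chr() calls are needed inside the loop.
    for i, b in enumerate(array("B", block)):
        acc[i] ^= b

acc = array("B", bytes(4))          # four zero bytes
xor_into(acc, b"\x01\x02\x03\x04")
xor_into(acc, b"\xff\x00\xff\x00")
result = acc.tobytes()              # back to an immutable bytes object
```

The loop still runs once per byte in Python, so this mainly removes the conversion calls rather than the loop overhead itself.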

On Oct 15, 10:19 pm, Michele <mich...@nectarine.it> wrote:
> class Encoder(object):
>     def create_random_block(self, data, seed, blocksize):
>         number_of_blocks = int(len(data)/blocksize)
>         random.seed(seed)
>         random_block = ['0'] * blocksize

You possibly mean '\0' i.e. the byte all of whose bits are zero.

>         for index in range(number_of_blocks):
>             if int(random.getrandbits(1)) == 1:

getrandbits(1) produces a *long* with one random bit. Any good reason
for preferring this to randrange(2) or randint(0, 1)?
So there's a 50% chance that this block will be XORed into the result;
is that what you intend?

>                 block = data[blocksize*index:blocksize*index+blocksize]

You don't need to slice out block, certainly not so awkwardly.

>                 for bit in range(len(block)):

Perhaps you mean "byte_index", not "bit".
On my assumption that range(len(block)) is invariant: calculate it
once. If that assumption is incorrect, then so is your code for
calculating the number of blocks; it ignores a possible short block at
the end.
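
To make the short-block remark concrete, here is a small sketch (not from the thread; Python 3, with made-up sizes) of how truncating division drops trailing bytes, while stepping through the data by offset still reaches them:

```python
data = bytes(range(10))   # 10 bytes of sample data (made-up size)
blocksize = 4

# Counting blocks with truncating division, as in the quoted code:
number_of_blocks = len(data) // blocksize

# Stepping by offset instead also reaches the short block at the end:
blocks = [data[start:start + blocksize]
          for start in range(0, len(data), blocksize)]
block_lengths = [len(b) for b in blocks]

print(number_of_blocks)   # 2 full blocks; the last 2 bytes are never visited
print(block_lengths)      # [4, 4, 2]
```
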
>                     random_block[bit] = chr(ord(random_block[bit])^ord(block[bit]))

The chr() and one ord() are utterly wasteful; leave random_block as a
list of ints and do the chr() thing in the return statement.

>                     # workaround for bitwise xor of str;
>                     # xor is only supported for int, hence ord
>         return ''.join(random_block)

this will become
    return ''.join(map(chr, random_block))
or
    return ''.join(chr(i) for i in random_block)
as taste or speed dictates :-)

So the whole thing becomes [not tested]:

def create_random_block(self, data, seed, blocksize):
    datalen = len(data)
    assert datalen % blocksize == 0
    random.seed(seed)
    random_block = [0] * blocksize
    block_range = range(blocksize)
    for start in xrange(0, datalen, blocksize):
        if random.randrange(2):
            for x in block_range:
                random_block[x] ^= ord(data[start + x])
    return ''.join(map(chr, random_block))

Looks slightly more athletic than before :-)

BTW, +1 misleading subject of the week; it's not XOR that's slow!!

Cheers,
John
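
For readers on Python 3, where indexing a bytes object already yields ints, the same idea can be pushed a step further by XORing whole blocks as big integers, so that in CPython the per-byte work happens inside the interpreter's C code instead of a Python loop. The following is only a sketch under that assumption, not code from the thread; it keeps the thread's function name but drops the class, uses a private random.Random instance, and has not been timed:

```python
import os
import random

def create_random_block(data, seed, blocksize):
    """XOR together a seed-dependent random subset of the blocks of `data`.

    Python 3 sketch: `data` is bytes whose length is a multiple of
    `blocksize`; the result is a bytes object of length `blocksize`.
    """
    if len(data) % blocksize != 0:
        raise ValueError("data length must be a multiple of blocksize")
    rng = random.Random(seed)   # private generator; leaves global state alone
    acc = 0                     # running XOR, held as one big integer
    for start in range(0, len(data), blocksize):
        if rng.randrange(2):    # include this block with probability 1/2
            acc ^= int.from_bytes(data[start:start + blocksize], "big")
    return acc.to_bytes(blocksize, "big")

piece = os.urandom(64 * 1024)
block = create_random_block(piece, seed=1234, blocksize=1024)
```

Because the generator is seeded, calling the function again with the same data and seed selects the same blocks and returns the same result.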