xor: how come so slow?


Problem description



Hi,
I'm trying to encode some byte data. Let's not focus on the encoding
process itself; what I want to emphasize is that the method
create_random_block takes about 0.5 s to execute (even Java is faster)
on a Dual-Core 3.0 GHz machine:

took 46.746999979s, avg: 0.46746999979s

So I suppose that the xor operation between bytes is what raises the
execution time to 0.5 s. Why do I suppose that?
Because Python has no support for bytes, nor for xoring bytes, so I
used a workaround:
I cycle over the two strings to be xor-ed
    for every char in the strings
        convert each char to an integer and then xor them (ord)
        insert one char in the result, converting the resulting
        integer back to a char (chr)

I suppose that ord() and chr() are the main problems of my
implementation, but I don't know any other way to xor two bytes of data
(bytes are represented as strings).
For more information, see the code attached.

How should I decrease the execution time?

Thank you
from __future__ import division
import random
import time
import sha
import os

class Encoder(object):
    def create_random_block(self, data, seed, blocksize):
        number_of_blocks = int(len(data)/blocksize)
        random.seed(seed)
        random_block = ['0'] * blocksize
        for index in range(number_of_blocks):
            if int(random.getrandbits(1)) == 1:
                block = data[blocksize*index:blocksize*index+blocksize]
                for bit in range(len(block)):
                    # workaround to xor str values byte by byte;
                    # xor is only supported for int, hence ord()
                    random_block[bit] = chr(ord(random_block[bit])^ord(block[bit]))
        return ''.join(random_block)

x = Encoder()
piece = os.urandom(1024*1024)
blocksize = 16384
t1 = time.time()
for l in range(100):
    seed = random.getrandbits(32)
    block = x.create_random_block(piece, seed, blocksize)
t2 = time.time()
print 'took ' + str(t2-t1) + 's, avg: ' + str((t2-t1)/100.0) + 's'

Solution

My answer is: never do things like this with python.
You will find this module useful: www.pycrypto.org

On Oct 15, 12:19 pm, Michele <mich...@nectarine.it> wrote:

[snip]

Few suggestions for your code:
- Use xrange instead of range.
- Loop over lists where you can instead of their indexes.
- array.array("B", somestring) may help you because it gives a byte
"view" of a string.
- Using psyco helps a lot for such kind of code.
- I think numpy arrays can contain text/chars too, so it may offer you
ways to speed up your code a lot.
- Generally Python is fit to download pages from the net or to act as
glue between different subsystems, or to do bulk string processing,
etc, but for low-level grunt work like this it's often much too slow,
and you can use other lower-level languages.
- You can use a lib already written, or use an extension, for example
you can try ShedSkin, or Pyd.
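
As a rough illustration of the "loop over values, not indexes" and "byte view" suggestions above, here is a minimal sketch. It assumes Python 3 (the thread itself is Python 2), where iterating over a bytes object already yields ints, so no ord()/chr() calls are needed. The name xor_bytes is hypothetical and not from the thread.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings, iterating over values, not indexes."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # In Python 3, iterating over bytes yields ints in range(256).
    return bytes(x ^ y for x, y in zip(a, b))


print(xor_bytes(b"\x0f\xf0", b"\xff\xff"))  # b'\xf0\x0f'
```

This is still a per-byte loop in the interpreter, so it mainly removes the conversion overhead rather than the loop itself.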

Bye,
bearophile


On Oct 15, 10:19 pm, Michele <mich...@nectarine.it> wrote:

[snip]

class Encoder(object):
    def create_random_block(self, data, seed, blocksize):
        number_of_blocks = int(len(data)/blocksize)
        random.seed(seed)
        random_block = ['0'] * blocksize

You possibly mean '\0', i.e. the byte all of whose bits are zero.

        for index in range(number_of_blocks):
            if int(random.getrandbits(1)) == 1:

getrandbits(1) produces a *long* with one random bit. Any good reason
for preferring this to randrange(2) and randint(0, 1)?

So there's a 50% chance that this block will be XORed into the result;
is that what you intend?

                block = data[blocksize*index:blocksize*index+blocksize]

You don't need to slice out block, certainly not so awkwardly.

                for bit in range(len(block)):


Perhaps you mean "byte_index", not "bit".

On my assumption that range(len(block)) is invariant: calculate it
once. That assumption is incorrect, so is your code for calculating
the number of blocks; it ignores a possible short block at the end.
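
To make the short-block remark concrete, here is a small sketch (Python 3, with sizes picked arbitrarily for illustration; covered_bytes is a hypothetical helper, not part of the posted code). It shows how many bytes are actually visited when only whole blocks are counted.

```python
def covered_bytes(datalen: int, blocksize: int) -> int:
    """Number of bytes visited when only whole blocks are processed."""
    number_of_blocks = datalen // blocksize  # whole blocks only
    return number_of_blocks * blocksize


datalen, blocksize = 1000, 300
print(covered_bytes(datalen, blocksize))            # 900
print(datalen - covered_bytes(datalen, blocksize))  # 100 trailing bytes skipped
```

With the sizes from the original post (1024*1024 bytes, blocksize 16384) the division is exact, so nothing is skipped there.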

                    random_block[bit] = chr(ord(random_block[bit])^ord(block[bit]))

The chr() and one ord() are utterly wasteful; leave random_block as a
list of ints and do the chr() thing in the return statement.

                    # workaround to xor str values byte by byte;
                    # xor is only supported for int, hence ord()
        return ''.join(random_block)

this will become
    return ''.join(map(chr, random_block))
or
    return ''.join(chr(i) for i in random_block)
as taste or speed dictates :-)
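
For readers on Python 3 (an assumption; the thread targets Python 2), the same "convert once at the end" step needs no chr() at all, because bytes() accepts a list of ints directly. The values below are made up for illustration.

```python
random_block = [72, 105, 33]  # example ints in range(256), not from the thread
as_bytes = bytes(random_block)  # one conversion at the end, no chr() per item
print(as_bytes)  # b'Hi!'
```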

So the whole thing becomes [not tested]:

def create_random_block(self, data, seed, blocksize):
    datalen = len(data)
    assert datalen % blocksize == 0
    random.seed(seed)
    random_block = [0] * blocksize
    block_range = range(blocksize)
    for start in xrange(0, datalen, blocksize):
        if random.randrange(2):
            for x in block_range:
                random_block[x] ^= ord(data[start + x])
    return ''.join(map(chr, random_block))
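
The inner per-byte loop can also be avoided entirely. The following is only a sketch under assumptions that are not in the thread: it targets Python 3, is written as a free function rather than a method, uses a local random.Random(seed) instead of the global generator, and XORs whole blocks as big integers via int.from_bytes. Like the version above it starts from zero bytes rather than '0' characters, and its random choices will not match those of the Python 2 code.

```python
import os
import random


def create_random_block(data: bytes, seed: int, blocksize: int) -> bytes:
    """XOR together a random subset of the blocks of data (sketch)."""
    if len(data) % blocksize != 0:
        raise ValueError("data length must be a multiple of blocksize")
    rng = random.Random(seed)  # local generator (my choice, not from the thread)
    acc = 0  # running XOR of the selected blocks, held as one big integer
    for start in range(0, len(data), blocksize):
        if rng.randrange(2):  # each block is selected with probability 1/2
            acc ^= int.from_bytes(data[start:start + blocksize], "big")
    # acc always fits in blocksize bytes because every operand does.
    return acc.to_bytes(blocksize, "big")


piece = os.urandom(64 * 1024)
block = create_random_block(piece, seed=1234, blocksize=16384)
print(len(block))  # 16384
```

The XOR then happens once per block inside the interpreter's integer arithmetic instead of once per byte in a Python-level loop, which is the point John makes: the loop overhead, not XOR itself, is the slow part.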

Looks slightly more athletic than before :-)

BTW, +1 misleading subject of the week; it's not XOR that's slow!!

Cheers,
John

