Fastest way to convert bitstring numpy array to integer base 2

Question

I have a numpy array consisting of bitstrings, and I intend to convert the bitstrings to base-2 integers in order to perform some XOR bitwise operations. I can convert a string to an integer with base 2 in Python like this:

int('000011000',2)

I am wondering if there is a faster and better way to do this in numpy. An example of the numpy array that I am working on is something like this:

array([['0001'],
       ['0010']], 
      dtype='|S4')

and I want to convert it to:

array([[1],[2]])


Recommended answer

One could use np.fromstring to split each string into its individual characters as uint8 ASCII codes. Subtracting 48 (the ASCII code of '0') turns those codes into 0/1 numerals, and a matrix multiplication with the powers of two then reduces each row to its decimal value. Thus, with A as the input array, one approach would be like so -

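import numpy as np

# Convert each bit of input string to numerals (48 is the ASCII code of '0').
# Note: binary-mode np.fromstring is deprecated in newer NumPy releases;
# np.frombuffer is the documented replacement.
str2num = (np.fromstring(A, dtype=np.uint8)-48).reshape(-1,4)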

# Setup conversion array for binary number to decimal equivalent
de2bi_convarr = 2**np.arange(3,-1,-1)

# Use matrix multiplication for reducing each row of str2num to a single decimal
out = str2num.dot(de2bi_convarr)

Sample run -

In [113]: A    # Modified to show more variety
Out[113]: 
array([['0001'],
       ['1001'],
       ['1100'],
       ['0010']], 
      dtype='|S4')

In [114]: str2num = (np.fromstring(A, dtype=np.uint8)-48).reshape(-1,4)

In [115]: str2num
Out[115]: 
array([[0, 0, 0, 1],
       [1, 0, 0, 1],
       [1, 1, 0, 0],
       [0, 0, 1, 0]], dtype=uint8)

In [116]: de2bi_convarr = 2**np.arange(3,-1,-1)

In [117]: de2bi_convarr
Out[117]: array([8, 4, 2, 1])

In [118]: out = str2num.dot(de2bi_convarr)

In [119]: out
Out[119]: array([ 1,  9, 12,  2])
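
For readers on a recent NumPy, where binary-mode np.fromstring emits a deprecation warning, the same idea can be sketched with ndarray.view, which reinterprets the array's bytes without copying them. This is only a sketch under stated assumptions, not part of the original answer: the helper name bitstrings_to_int is hypothetical, the input is assumed to be a fixed-width byte-string array (dtype 'S', not the unicode 'U' dtype), and the width is assumed small enough (under 63 bits) for the place values to fit in int64.

```python
import numpy as np

def bitstrings_to_int(A):
    """Hypothetical helper: (N, 1) array of fixed-width byte bitstrings -> integers."""
    A = np.ascontiguousarray(A)
    width = A.dtype.itemsize                  # bytes per string, e.g. 4 for '|S4'
    # Reinterpret the raw bytes as uint8 ASCII codes, one row per string,
    # then map the codes for '0' and '1' to the numerals 0 and 1.
    bits = A.view(np.uint8).reshape(-1, width) - ord('0')
    # Place values, most significant bit first: [2**(width-1), ..., 2, 1]
    weights = 2 ** np.arange(width - 1, -1, -1)
    # Each row dotted with the place values gives that row's integer value.
    return bits.dot(weights)

A = np.array([['0001'], ['1001'], ['1100'], ['0010']], dtype='S4')
out = bitstrings_to_int(A)
print(out)
```

Reading the width from A.dtype.itemsize instead of hard-coding 4 lets the same sketch handle other fixed string lengths.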






An alternative method avoids np.fromstring. With this method, we convert to an int datatype at the start and then separate out each digit, which should be equivalent to str2num in the previous method. The rest of the code stays the same. Thus, an alternative implementation would be -

# Convert to int array and thus convert each bit of input string to numerals
str2num = np.remainder(A.astype(int)//(10**np.arange(3,-1,-1)),10)

de2bi_convarr = 2**np.arange(3,-1,-1)
out = str2num.dot(de2bi_convarr)
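
To make the digit-separation step easier to follow, here is a small step-by-step sketch of the same computation with the intermediate arrays named. The variable names as_decimal, shifted and digits are illustrative only, and the sample array is the one from the sample run above. Because each string is first read as a decimal number, this route is presumably limited to widths whose decimal reading fits in a 64-bit integer.

```python
import numpy as np

A = np.array([['0001'], ['1001'], ['1100'], ['0010']], dtype='S4')

# Parse each bitstring as a *decimal* integer, e.g. b'1100' -> 1100
as_decimal = A.astype(int)                              # shape (4, 1)

# Floor-divide by 1000, 100, 10, 1; broadcasting yields shape (4, 4).
# For 1100 the quotients are [1, 11, 110, 1100].
shifted = as_decimal // (10 ** np.arange(3, -1, -1))

# The last decimal digit of each quotient is the original bit.
# For 1100 this gives [1, 1, 0, 0].
digits = np.remainder(shifted, 10)

# Same reduction as before: weight the bits by [8, 4, 2, 1] and sum.
out = digits.dot(2 ** np.arange(3, -1, -1))
print(digits)
print(out)
```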






Runtime test

Let's time all the approaches listed thus far to solve the problem, including @Kasramvd's loopy solution.

In [198]: # Setup a huge array of such strings
     ...: A = np.array([['0001'],['1001'],['1100'],['0010']],dtype='|S4')
     ...: A = A.repeat(10000,axis=0)


In [199]: def app1(A):             
     ...:     str2num = (np.fromstring(A, dtype=np.uint8)-48).reshape(-1,4)
     ...:     de2bi_convarr = 2**np.arange(3,-1,-1)
     ...:     out = str2num.dot(de2bi_convarr)    
     ...:     return out
     ...: 
     ...: def app2(A):             
     ...:     str2num = np.remainder(A.astype(int)//(10**np.arange(3,-1,-1)),10)
     ...:     de2bi_convarr = 2**np.arange(3,-1,-1)
     ...:     out = str2num.dot(de2bi_convarr)    
     ...:     return out
     ...: 

In [200]: %timeit app1(A)
1000 loops, best of 3: 1.46 ms per loop

In [201]: %timeit app2(A)
10 loops, best of 3: 36.6 ms per loop

In [202]: %timeit np.array([[int(i[0], 2)] for i in A]) # @Kasramvd's solution
10 loops, best of 3: 61.6 ms per loop
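
The absolute numbers above depend on the machine and on the NumPy and Python versions in use, so it may be worth re-running the comparison locally. The following is a minimal sketch using the standard timeit module outside IPython. The function names vectorized and loopy are placeholders, and vectorized uses ndarray.view in place of np.fromstring, so it is a stand-in for app1 rather than an exact copy.

```python
import timeit
import numpy as np

def vectorized(A):
    # Bytes -> 0/1 numerals -> weighted sum (stand-in for app1)
    width = A.dtype.itemsize
    bits = A.view(np.uint8).reshape(-1, width) - ord('0')
    return bits.dot(2 ** np.arange(width - 1, -1, -1))

def loopy(A):
    # One int(..., 2) call per row, as in the loop-based solution
    return np.array([[int(s[0], 2)] for s in A])

A = np.array([['0001'], ['1001'], ['1100'], ['0010']], dtype='S4')
A = A.repeat(10000, axis=0)

# Sanity check: both approaches must agree before timing them
assert np.array_equal(vectorized(A), loopy(A).ravel())

for fn in (vectorized, loopy):
    # Best of 3 repeats, 10 calls each, reported per call
    best = min(timeit.repeat(lambda: fn(A), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```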
