How can I slice each element of a numpy array of strings?

Posted: 2021/11/18 2:46:16 · Tags: python, arrays, string, numpy, slice


Question

Numpy has some very useful string operations, which vectorize the usual Python string operations.

Compared to these operations and to pandas.str, the numpy strings module seems to be missing a very important one: the ability to slice each string in the array. For example,

a = numpy.array(['hello', 'how', 'are', 'you'])
numpy.char.sliceStr(a, slice(1, 3))  # hypothetical function, shown only to illustrate the desired behavior
>>> numpy.array(['el', 'ow', 're', 'ou'])

Am I missing some obvious method in the module with this functionality? Otherwise, is there a fast vectorized way to achieve this?

Recommended answer

Here's a vectorized approach:

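import numpy as np

def slicer_vectorized(a, start, end):
    # View each string as single characters (one row per string), then slice the columns
    b = a.view((str, 1)).reshape(len(a), -1)[:, start:end]
    return np.frombuffer(b.tobytes(), dtype=(str, end - start))  # replaces deprecated fromstring/tostring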
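To see why the view trick works, it can help to inspect the intermediate arrays one step at a time. What follows is only an illustrative sketch, assuming Python 3, where NumPy stores `str` data as fixed-width unicode (dtype `<U5` for this input). The names `chars`, `grid`, `cut` and `out` are made up for this illustration, and the last step uses `np.ascontiguousarray(...).view(...)` as one possible way to rejoin the characters, not necessarily the only or the fastest one.

```python
import numpy as np

a = np.array(['hello', 'how', 'are', 'you'])   # dtype '<U5': every element is padded to 5 characters

# Step 1: reinterpret the same memory as single characters (no copy is made)
chars = a.view((str, 1))                       # shape (20,)

# Step 2: one row per original string, one column per character position
grid = chars.reshape(len(a), -1)               # shape (4, 5); padding shows up as ''

# Step 3: an ordinary column slice now slices every string at once
cut = grid[:, 1:3]                             # shape (4, 2)

# Step 4: glue each row of 2 characters back into one 2-character string
out = np.ascontiguousarray(cut).view((str, 2)).ravel()

print(grid)
print(out)
```

The slice itself is plain 2-D indexing, so the per-element Python call disappears; the only copy happens when the sliced columns are made contiguous again.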

Sample run:

In [68]: a = np.array(['hello', 'how', 'are', 'you'])

In [69]: slicer_vectorized(a,1,3)
Out[69]: 
array(['el', 'ow', 're', 'ou'], 
      dtype='|S2')

In [70]: slicer_vectorized(a,0,3)
Out[70]: 
array(['hel', 'how', 'are', 'you'], 
      dtype='|S3')
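The `|S2` and `|S3` dtypes in the sample run are byte strings, as printed under Python 2; under Python 3 the same input is stored as unicode and the result would be reported as `<U2` or `<U3`. One caveat with the approach is that `end` must not exceed the fixed character width of the array, otherwise the sliced buffer no longer matches the requested output dtype. The sketch below is a hypothetical variant (the name `slicer_clamped` is not from the original answer) that clamps `end` to that width and compares the result with a plain list comprehension. It assumes a 1-D unicode array and non-negative bounds.

```python
import numpy as np

def slicer_clamped(a, start, end):
    """Hypothetical variant: clamp `end` to the array's fixed string width."""
    a = np.ascontiguousarray(a)
    # Number of characters stored per element (5 for dtype '<U5')
    width = a.dtype.itemsize // np.dtype((str, 1)).itemsize
    end = min(end, width)
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end within the string width")
    # Same view-and-slice idea as above, one row of characters per string
    chars = a.view((str, 1)).reshape(len(a), width)[:, start:end]
    return np.ascontiguousarray(chars).view((str, end - start)).ravel()

a = np.array(['hello', 'how', 'are', 'you', 'hi'])
result = slicer_clamped(a, 1, 8)              # 8 is past the 5-character width
expected = np.array([s[1:8] for s in a])      # reference: ordinary Python slicing

print(result)
print(np.array_equal(result, expected))
```

Strings shorter than the slice need no special handling, because NumPy pads them with null characters that are dropped when the elements are read back.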

Runtime test:

I tested all the approaches posted by the other authors that I could run on my end, along with the vectorized approach from earlier in this post.

Here are the timings:

In [53]: # Setup input array
    ...: a = np.array(['hello', 'how', 'are', 'you'])
    ...: a = np.repeat(a,10000)
    ...: 

# @Alberto Garcia-Raboso's answer
In [54]: %timeit slicer(1, 3)(a)
10 loops, best of 3: 23.5 ms per loop

# @hpaulj's answer
In [55]: %timeit np.frompyfunc(lambda x:x[1:3],1,1)(a)
100 loops, best of 3: 11.6 ms per loop

# Using a list comprehension
In [56]: %timeit np.array([i[1:3] for i in a])
100 loops, best of 3: 12.1 ms per loop

# From this post
In [57]: %timeit slicer_vectorized(a,1,3)
1000 loops, best of 3: 787 µs per loop
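The timings above come from an IPython session. For anyone who wants to repeat the comparison outside IPython, a rough harness using the standard `timeit` module might look like the sketch below. It assumes Python 3 and swaps in `np.frombuffer`/`tobytes` for the older `fromstring`/`tostring` calls. The `slicer` function from the other answer is not shown in this post, so it is left out. The dictionary names `candidates` and `timings` are illustrative, and absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def slicer_vectorized(a, start, end):
    # View as single characters, slice the columns, rebuild fixed-width strings
    b = a.view((str, 1)).reshape(len(a), -1)[:, start:end]
    return np.frombuffer(b.tobytes(), dtype=(str, end - start))

# Same setup as in the timings above: 40,000 short strings
a = np.repeat(np.array(['hello', 'how', 'are', 'you']), 10000)

candidates = {
    "frompyfunc": lambda: np.frompyfunc(lambda x: x[1:3], 1, 1)(a),
    "list comprehension": lambda: np.array([i[1:3] for i in a]),
    "slicer_vectorized": lambda: slicer_vectorized(a, 1, 3),
}

timings = {}
for name, fn in candidates.items():
    # Best of 3 repeats, 10 calls each, reported per call
    timings[name] = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{name}: {timings[name] * 1e3:.3f} ms per call")
```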
