How to decode a numpy array of encoded literals/strings in Python3? AttributeError: 'numpy.ndarray' object has no attribute 'decode'
Question
In Python 3, I have the following NumPy array of strings.
Each string in the NumPy array is in the form b'MD18EE' instead of MD18EE.
For example:
import numpy as np
print(array1)
(b'first_element', b'element',...)
Normally, one would use .decode('UTF-8') to decode these elements.
However, if I try:
array1 = array1.decode('UTF-8')
I get the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'decode'
How do I decode these elements from a NumPy array? (That is, I don't want the b'' prefix.)
Let's say I was dealing with a Pandas DataFrame in which only certain columns were encoded in this manner. For example:
import pandas as pd
df = pd.DataFrame(...)
df
COL1 ....
0 b'entry1' ...
1 b'entry2'
2 b'entry3'
3 b'entry4'
4 b'entry5'
5 b'entry6'
Recommended answer
You have an array of bytestrings; the dtype is S:
In [338]: arr=np.array((b'first_element', b'element'))
In [339]: arr
Out[339]:
array([b'first_element', b'element'],
dtype='|S13')
astype easily converts them to unicode, the default string type for Py3:
In [340]: arr.astype('U13')
Out[340]:
array(['first_element', 'element'],
dtype='<U13')
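As a small sketch of the same conversion, the width does not have to be spelled out: passing 'U' without a length lets NumPy choose a width that fits the source array. The variable names here are illustrative only.

```python
import numpy as np

# Array of bytestrings, as in the answer above (dtype S13).
arr = np.array((b'first_element', b'element'))

# 'U' with no explicit length: NumPy sizes the unicode dtype
# to match the bytestring width of the source array.
as_text = arr.astype('U')

print(as_text)        # elements are now str, shown without the b prefix
print(as_text.dtype)  # a unicode dtype wide enough for 13 characters
```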
There is also a library of string functions, np.char, which applies the corresponding str method to the elements of a string array:
In [341]: np.char.decode(arr)
Out[341]:
array(['first_element', 'element'],
dtype='<U13')
astype is faster, but decode lets you specify an encoding.
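The difference matters once the bytes are not plain ASCII. The following is a minimal sketch using made-up sample data (UTF-8 encoded "café" and "naïve"). It passes an explicit encoding to np.char.decode and also tries astype on the same data; as far as I know, the S-to-U cast assumes ASCII and fails on such bytes, but treat that as an observation to verify on your NumPy version.

```python
import numpy as np

# Hypothetical sample data: UTF-8 encoded bytes containing non-ASCII characters.
raw = np.array([b'caf\xc3\xa9', b'na\xc3\xafve'])

# np.char.decode calls bytes.decode element-wise with the given encoding.
decoded = np.char.decode(raw, encoding='utf-8')
print(decoded)

# astype takes no encoding argument; on non-ASCII bytes the cast is
# expected to raise a Unicode error rather than guess an encoding.
try:
    raw.astype('U')
    astype_ok = True
except UnicodeError:
    astype_ok = False
print('astype succeeded:', astype_ok)
```

For pure-ASCII identifiers such as the ones in the question, either approach should give the same result.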
See also: How to decode a numpy array of dtype=numpy.string_?
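For the DataFrame case raised in the question, the answer above does not give code, so the following is only a sketch under assumptions: the frame contents and the column list bytes_columns are invented for illustration, and the bytes are assumed to be UTF-8. It uses the pandas Series.str.decode method on just the affected columns.

```python
import pandas as pd

# Hypothetical frame: COL1 holds bytes objects, COL2 is an ordinary numeric column.
df = pd.DataFrame({
    'COL1': [b'entry1', b'entry2', b'entry3'],
    'COL2': [1, 2, 3],
})

# Assumed list of the columns known to contain bytes.
bytes_columns = ['COL1']

# Decode each bytes column element-wise; other columns are left untouched.
for col in bytes_columns:
    df[col] = df[col].str.decode('utf-8')

print(df)
```

Listing the columns explicitly avoids calling the .str accessor on numeric columns, which would raise an error.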