How to decode a numpy array of encoded literals/strings in Python3? AttributeError: 'numpy.ndarray' object has no attribute 'decode'


Question

In Python 3, I have the following NumPy array of strings.

Each string in the NumPy array is in the form b'MD18EE' instead of MD18EE.

For example:

import numpy as np
print(array1)
(b'first_element', b'element',...)

Normally, one would use .decode('UTF-8') to decode these elements.

However, if I try:

array1 = array1.decode('UTF-8')

I get the following error:

AttributeError: 'numpy.ndarray' object has no attribute 'decode'

How do I decode these elements from a NumPy array? (That is, I don't want the b'' prefix.)

Let's say I am dealing with a Pandas DataFrame in which only certain columns are encoded in this manner. For example:

import pandas as pd
df = pd.DataFrame(...)

df
        COL1          ....
0   b'entry1'         ...
1   b'entry2'
2   b'entry3'
3   b'entry4'
4   b'entry5'
5   b'entry6'

Recommended answer

You have an array of bytestrings; the dtype is S:

In [338]: arr=np.array((b'first_element', b'element'))
In [339]: arr
Out[339]: 
array([b'first_element', b'element'], 
      dtype='|S13')

astype easily converts them to unicode, the default string type in Python 3.

In [340]: arr.astype('U13')
Out[340]: 
array(['first_element', 'element'], 
      dtype='<U13')
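The width in 'U13' does not have to be hard-coded. As a minimal sketch, assuming the same two-element array as above, passing 'U' with no length lets NumPy pick a width that fits the data (the name as_text is only illustrative):

```python
import numpy as np

arr = np.array((b'first_element', b'element'))

# 'U' without an explicit length: NumPy chooses a width large enough
# for the longest element instead of a hard-coded 13.
as_text = arr.astype('U')

print(as_text.dtype.kind)  # 'U' means a unicode string dtype
print(as_text)
```

This conversion is only suitable when the bytes are plain ASCII; for other encodings, decode explicitly.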

There is also a library of string functions, np.char, which applies the corresponding string method to each element of a string array:

In [341]: np.char.decode(arr)
Out[341]: 
array(['first_element', 'element'], 
      dtype='<U13')

astype is faster, but decode lets you specify an encoding.
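To illustrate the encoding point, here is a small sketch that decodes UTF-8 bytes containing non-ASCII characters with np.char.decode. The sample words and the names raw and decoded are made up for the example and are not from the question:

```python
import numpy as np

# Hypothetical sample data: UTF-8 encoded bytes with non-ASCII characters.
raw = np.array(["café".encode("utf-8"), "naïve".encode("utf-8")])

# np.char.decode calls bytes.decode on every element; the encoding is explicit.
decoded = np.char.decode(raw, encoding="utf-8")

print(decoded.dtype.kind)  # 'U' means a unicode string dtype
print(decoded)
```

If the bytes were written with a different codec, pass that codec name as encoding instead.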

See also: How to decode a numpy array of dtype=numpy.string_?
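The answer above does not cover the DataFrame case from the question. One possible approach, sketched here with a hypothetical frame standing in for the elided pd.DataFrame(...), is to decode only the affected columns with the pandas Series.str.decode accessor (the column COL2 and the list bytes_columns are assumptions made for this example):

```python
import pandas as pd

# Hypothetical frame standing in for the elided pd.DataFrame(...) in the question.
df = pd.DataFrame({
    "COL1": [b"entry1", b"entry2", b"entry3"],
    "COL2": [1, 2, 3],
})

# Assumed list of columns known to hold bytes; other columns are left untouched.
bytes_columns = ["COL1"]

for col in bytes_columns:
    # Series.str.decode applies bytes.decode to each element of the column.
    df[col] = df[col].str.decode("utf-8")

print(df)
```

Listing the columns explicitly keeps numeric columns and columns that are already text out of the conversion.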
