Convert a string with brackets to a NumPy array
Question
I have an array-like structure stored as a string in a dataframe column (I read the dataframe from a CSV file).
One string element of this column looks like this:
In [1]: df.iloc[0]['points']
Out[1]: '[(-0.0426, -0.7231, -0.4207), (0.2116, -0.1733, -0.1013), (...)]'
So it really is an array-like structure, which looks "ready for NumPy" to me.
numpy.fromstring() doesn't help, as it doesn't like brackets (see the related question "convert string representation of array to numpy array in python").
If I copy the contents of the string and paste them directly into numpy.array() as a literal, I get a NumPy array back.
But if I pass the variable containing the string instead, as in np.array(df.iloc[0]['points']), it does not work and gives me ValueError: could not convert string to float
Is there any function that does this in a simple way (without replacing or regex-ing the brackets)?
Recommended answer
You can use ast.literal_eval to parse the string before passing the result to numpy.array:
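from ast import literal_eval
import numpy as np

# String as it appears in the dataframe column
x = '[(-0.0426, -0.7231, -0.4207), (0.2116, -0.1733, -0.1013)]'
# Parse the string into a list of tuples, then build the array from it
res = np.array(literal_eval(x))
print(repr(res))
# array([[-0.0426, -0.7231, -0.4207],
#        [ 0.2116, -0.1733, -0.1013]])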
You can do the equivalent with the strings in a Pandas series, but it's not clear whether you need to aggregate across rows. If you do, you can combine a list of NumPy arrays, each derived using the logic above.
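As a minimal sketch of the series case: the dataframe below is hypothetical stand-in data (the real one comes from the CSV in the question), and stacking with np.vstack is only one possible way to aggregate, assuming every point has the same number of coordinates.

```python
from ast import literal_eval

import numpy as np
import pandas as pd

# Hypothetical data standing in for the CSV-derived dataframe in the question.
df = pd.DataFrame({
    "points": [
        "[(-0.0426, -0.7231, -0.4207), (0.2116, -0.1733, -0.1013)]",
        "[(0.1, 0.2, 0.3)]",
    ]
})

# One NumPy array per row; rows may hold different numbers of points.
arrays = [np.array(literal_eval(s)) for s in df["points"]]

# Optional aggregation across rows into a single (n_points, 3) array.
combined = np.vstack(arrays)
print(combined.shape)  # (3, 3)
```

If the rows should stay separate, the list `arrays` can be used directly without the stacking step.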
The docs explain the types acceptable to literal_eval:
Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.
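A short illustration of that restriction, with made-up input strings: a nested literal is parsed, while a string containing a function call is rejected with ValueError instead of being executed.

```python
from ast import literal_eval

# A nested container of plain literals parses without trouble.
parsed = literal_eval("{'a': [1, 2.5, None], 'b': (True, b'x')}")
print(parsed)

# Anything beyond literals (here, a function call) is refused, not run.
rejected = False
try:
    literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True
print("rejected:", rejected)
```

This is the reason literal_eval is generally preferred over eval for strings that come from a file.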
So we are effectively converting the string to a list of tuples, which np.array can then convert to a NumPy array.
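To make the two steps visible, the sketch below inspects the intermediate object and the final array, reusing the example string from the answer; the variable name `parsed` is introduced here only for illustration.

```python
from ast import literal_eval

import numpy as np

x = '[(-0.0426, -0.7231, -0.4207), (0.2116, -0.1733, -0.1013)]'

# Step 1: the string becomes an ordinary Python list of tuples.
parsed = literal_eval(x)
print(type(parsed).__name__, type(parsed[0]).__name__)  # list tuple

# Step 2: np.array turns the nested sequence into a 2-D float array.
res = np.array(parsed)
print(res.shape, res.dtype)  # (2, 3) float64
```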