Converting a pandas DataFrame to a structured array
Question
I have the following pandas DataFrame:
import pandas as pd
a = [2.5,3.3]
b = [3.6,3.9]
D = {'A': a, 'B': b}
df = pd.DataFrame(D)  # build the DataFrame from the dict of columns
This gives me something like
+---+-----+-----+
| | A | B |
+---+-----+-----+
| 0 | 2.5 | 3.6 |
| 1 | 3.3 | 3.9 |
+---+-----+-----+
I want to convert this DataFrame to a structured array like
import numpy as np
data = np.rec.array([
    ('A', 2.5),
    ('A', 3.3),
    ('B', 3.6),
    ('B', 3.9),
], dtype=[('Type', '|U5'), ('Value', '<f8')])  # float field, since the values are floats
I failed to find a way to make this happen since I'm new to pandas. I tried df.to_records(), but the index is getting in the way and I cannot find a way around that.
Any help is appreciated. Thanks.
Recommended answer
Melt the DataFrame to turn A and B (the column index) into a column. To get rid of the numeric index, make this new column the index. Then call to_records():
import pandas as pd
a = [2.5,3.3]
b = [3.6,3.9]
D = {'A': a, 'B': b}
df = pd.DataFrame(D)
result = (pd.melt(df, var_name='Type', value_name='Value')
.set_index('Type').to_records())
print(repr(result))
yields
rec.array([('A', 2.5), ('A', 3.3), ('B', 3.6), ('B', 3.9)],
dtype=[('Type', 'O'), ('Value', '<f8')])
This is the key step:
In [167]: df
Out[167]:
A B
0 2.5 3.6
1 3.3 3.9
In [168]: pd.melt(df)
Out[168]:
variable value
0 A 2.5
1 A 3.3
2 B 3.6
3 B 3.9
Once you've melted the DataFrame, to_records() (basically) returns the desired result. The only extra piece is the numeric index, which shows up as the first field:
In [169]: pd.melt(df).to_records()
Out[169]:
rec.array([(0, 'A', 2.5), (1, 'A', 3.3), (2, 'B', 3.6), (3, 'B', 3.9)],
dtype=[('index', '<i8'), ('variable', 'O'), ('value', '<f8')])
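As an alternative to setting 'Type' as the index, the numeric index can be dropped by passing index=False to to_records(). The sketch below also shows one way to get the fixed-width '|U5' string field from the question, by copying the columns into a preallocated NumPy structured array. The names long_df, records and fixed are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [2.5, 3.3], 'B': [3.6, 3.9]})

# Wide -> long: the column labels A and B become values of a 'Type' column.
long_df = pd.melt(df, var_name='Type', value_name='Value')

# index=False leaves the numeric index out of the record array.
records = long_df.to_records(index=False)

# Optional: copy into a structured array with a fixed-width unicode field,
# as in the question's dtype (assumes every label fits in 5 characters).
fixed = np.empty(len(long_df), dtype=[('Type', 'U5'), ('Value', 'f8')])
fixed['Type'] = long_df['Type'].to_numpy()
fixed['Value'] = long_df['Value'].to_numpy()

print(repr(records))
print(repr(fixed))
```

Fields are then accessed by name, for example fixed['Value'] or records['Type'].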