Concatenating dictionaries of numpy arrays (avoiding manual loops if possible)

Question

I am looking for a way to concatenate the values in two Python dictionaries that contain numpy arrays, while avoiding a manual loop over the dictionary keys. For example:

import numpy as np

# Create first dictionary
n = 5
s = np.random.randint(1, 101, n)
r = np.random.rand(n)
d = {"r": r, "s": s}
print("d = ", d)

# Create second dictionary (it has an extra key, "t")
n = 2
s = np.random.randint(1, 101, n)
r = np.random.rand(n)
t = np.array(["a", "b"])
d2 = {"r": r, "s": s, "t": t}
print("d2 = ", d2)

# Some operation to combine the two dictionaries...
# (SomeOperation is a placeholder for the function being asked about)
d = SomeOperation(d, d2)

# Updated dictionary
print("d3 = ", d)

which gives the output:

>> d =  {'s': array([75, 25, 88, 54, 82]), 'r': array([ 0.1021227 ,  0.99454874, 0.38680718,  0.98720877,  0.8662894 ])}
>> d2 =  {'s': array([78, 92]), 'r': array([ 0.27610587,  0.57037473]), 't': array(['a', 'b'], dtype='|S1')}
>> d3 =  {'s': array([75, 25, 88, 54, 82, 78, 92]), 'r': array([ 0.1021227 ,  0.99454874, 0.38680718,  0.98720877,  0.8662894, 0.27610587,  0.57037473]), 't': array(['a', 'b'], dtype='|S1')}

That is, if a key already exists, the new values are appended to the numpy array stored under that key; a key that exists only in the second dictionary (such as "t" above) is added as it is.
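For reference, the behaviour described above can be written with an explicit loop over the keys, which is the baseline the question hopes to improve on. This is only a minimal sketch: `merge_array_dicts` is a hypothetical helper name that does not appear in the original post, and the small input arrays are illustrative values.

```python
import numpy as np


def merge_array_dicts(first, second):
    """Return a new dict; arrays stored under shared keys are concatenated.

    Hypothetical helper, not part of the original post.
    """
    merged = dict(first)  # shallow copy so the input dicts are left untouched
    for key, arr in second.items():
        if key in merged:
            # Key exists in both dicts: append the new values to the old ones.
            merged[key] = np.concatenate([merged[key], arr])
        else:
            # Key exists only in the second dict: carry it over unchanged.
            merged[key] = arr
    return merged


# Illustrative inputs (not the random data from the question)
d = {"r": np.array([0.1, 0.2, 0.3]), "s": np.array([75, 25, 88])}
d2 = {"r": np.array([0.4]), "s": np.array([78]), "t": np.array(["a"])}
d3 = merge_array_dicts(d, d2)
print(d3)
```

The loop body runs once per key rather than once per array element, so the per-element work is done inside `np.concatenate`.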

Does anybody know the best way to do this while minimising the use of slow, manual for loops? (I would like to avoid loops because the dictionaries I want to combine could have hundreds of keys.)

Thanks!

Recommended answer

You can use pandas:

from __future__ import print_function, division
import pandas as pd
import numpy as np

# Create first dictionary
n = 5
s = np.random.randint(1,101,n)
r = np.random.rand(n)
d = {"r":r,"s":s}
df = pd.DataFrame(d)
print(df)

# Create second dictionary
n = 2
s = np.random.randint(1,101,n)
r = np.random.rand(n)
t = np.array(["a","b"])
d2 = {"r":r,"s":s,"t":t}
df2 = pd.DataFrame(d2)
print(df2)

print(pd.concat([df, df2]))

Output:

          r   s
0  0.551402  49
1  0.620870  34
2  0.535525  52
3  0.920922  13
4  0.708109  48
          r   s  t
0  0.231480  43  a
1  0.492576  10  b
          r   s    t
0  0.551402  49  NaN
1  0.620870  34  NaN
2  0.535525  52  NaN
3  0.920922  13  NaN
4  0.708109  48  NaN
0  0.231480  43    a
1  0.492576  10    b
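Note that the concatenated frame repeats the index labels 0 and 1, and fills `t` with NaN for the rows that came from the first dictionary. If a plain dictionary of arrays like `d3` in the question is needed, one possible way back is sketched below. It assumes that missing values should simply be dropped per column, which would also discard any genuine NaN values in the data; the input values are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative inputs (not the random data from the answer above)
df = pd.DataFrame({"r": np.array([0.55, 0.62, 0.53]),
                   "s": np.array([49, 34, 52])})
df2 = pd.DataFrame({"r": np.array([0.23, 0.49]),
                    "s": np.array([43, 10]),
                    "t": np.array(["a", "b"])})

# ignore_index=True renumbers the rows 0..4 instead of repeating 0 and 1
combined = pd.concat([df, df2], ignore_index=True)

# Convert each column back to a numpy array, dropping the NaN padding
# that concat added where a column was missing from one of the frames
d3 = {col: combined[col].dropna().to_numpy() for col in combined.columns}
print(d3)
```

The dictionary comprehension still iterates over the columns, but only once per key; the concatenation itself is handled by `pd.concat`.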
