Concatenating dictionaries of numpy arrays of different lengths (avoiding manual loops if possible)


Question

I have a question similar to the one discussed in "Concatenating dictionaries of numpy arrays (avoiding manual loops if possible)".

I am looking for a way to concatenate the values of two Python dictionaries that contain numpy arrays of arbitrary size, without having to loop manually over the dictionary keys. For example:

import numpy as np

# Create first dictionary
n1 = 3
s = np.random.randint(1, 101, n1)
n2 = 2
r = np.random.rand(n2)
d = {"r": r, "s": s}
print("d = ", d)

# Create second dictionary
n3 = 1
s = np.random.randint(1, 101, n3)
n4 = 3
r = np.random.rand(n4)
d2 = {"r": r, "s": s}
print("d2 = ", d2)

# Some operation to combine the two dictionaries...
# (SomeOperation is a placeholder for the function being asked about)
d = SomeOperation(d, d2)

# Updated dictionary
print("d3 = ", d)

giving the output

>> d =  {'s': array([75, 25, 88]), 'r': array([ 0.1021227 ,  0.99454874])}
>> d2 =  {'s': array([78]), 'r': array([ 0.27610587,  0.57037473, 0.59876391])}
>> d3 =  {'s': array([75, 25, 88, 78]), 'r': array([ 0.1021227 ,  0.99454874, 0.27610587,  0.57037473, 0.59876391])}

That is, if a key already exists, the new values are appended to the numpy array stored under that key.

The solution proposed in the previous discussion, which uses the pandas package, does not work here because it requires arrays of the same length (n1 = n2 and n3 = n4).

Does anybody know the best way to do this while minimising the use of slow, manual for loops? (I would like to avoid loops because the dictionaries I want to combine could have hundreds of keys.)

Thanks (also to "Aim" for formulating a very clear question)!

Recommended answer

One way to go is to use a dictionary of Series (i.e. the values are Series rather than arrays):

In [11]: d2
Out[11]: {'r': array([ 0.3536318 ,  0.29363604,  0.91307454]), 's': array([46])}

In [12]: d2 = {name: pd.Series(arr) for name, arr in d2.items()}

In [13]: d2
Out[13]:
{'r': 0    0.353632
1    0.293636
2    0.913075
dtype: float64,
 's': 0    46
dtype: int64}

That way you can pass it into the DataFrame constructor:

In [14]: pd.DataFrame(d2)
Out[14]:
          r   s
0  0.353632  46
1  0.293636 NaN
2  0.913075 NaN
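
The answer stops at building a single DataFrame. One possible way to finish the concatenation is sketched below. It is not part of the original answer: the helper names `to_frame` and `concat_dicts` are hypothetical, and it assumes a pandas version that provides `Series.to_numpy`. Each dictionary becomes a NaN-padded DataFrame, the two frames are stacked with `pd.concat`, and the padding is dropped column by column.

```python
import numpy as np
import pandas as pd

def to_frame(d):
    # Hypothetical helper: wrap each array in a Series so that columns of
    # different lengths are allowed; shorter columns are padded with NaN.
    return pd.DataFrame({name: pd.Series(arr) for name, arr in d.items()})

def concat_dicts(d1, d2):
    # Hypothetical helper: stack the two frames row-wise, then strip the
    # NaN padding from each column and return plain numpy arrays.
    combined = pd.concat([to_frame(d1), to_frame(d2)], ignore_index=True)
    return {name: combined[name].dropna().to_numpy() for name in combined.columns}

# Small fixed inputs with the same shapes as in the question
d = {"r": np.array([0.1, 0.9]), "s": np.array([75, 25, 88])}
d2 = {"r": np.array([0.2, 0.5, 0.6]), "s": np.array([78])}
d3 = concat_dicts(d, d2)
print("d3 = ", d3)
```

Two caveats apply to this sketch: integer columns come back as float64 because of the NaN padding, and any genuine NaN values in the data are dropped along with the padding. If either matters, a plain dictionary comprehension such as `{k: np.concatenate([d[k], d2[k]]) for k in d}` keeps the dtypes intact. It still iterates over the keys, but for a few hundred keys that cost is likely small compared with the array copies.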
