Merge (numpy) arrays based on date


Question

I have N arrays, each structured as follows:

Array 1: [['2014-01-01', '2014-01-03' ...], [1.1, 0.5, ...]]
Array 2: [['2014-01-01', '2014-01-02' ...], [1.4, 0.9, ...]]
Array 3: [['2014-01-02', '2014-01-04' ...], [0.8, 1.5, ...]]

I want to get some kind of data frame like the following:

date            1-data    2-data
2014-01-01      1.1       1.4
2014-01-02      0         0.9
2014-01-03      0.5       0
2014-01-04      0         0

The problem, as you can see from the example, is that some dates are missing from each array (i.e., the dates aren't the same across all of the arrays). I am struggling to find a quick, Pythonic way to merge all of my arrays into a single data frame and fill the missing data with zeros.

Recommended answer

This should work using the merge function with how='outer':

>>> import pandas as pd
>>> import numpy as np
>>> d1 = pd.DataFrame(np.array([['2014-01-01', '2014-01-03'], [1.1, 0.5]])).T
>>> d2 = pd.DataFrame(np.array([['2014-01-01', '2014-01-02'], [1.4, 0.9]])).T
>>> d3 = pd.DataFrame(np.array([['2014-01-02', '2014-01-04'], [0.8, 1.5]])).T
>>> d1.columns = d2.columns = d3.columns = ['t','v']
>>> # DataFrame.sort was removed from pandas; sort_values is its replacement
>>> pd.DataFrame(np.array(d1.merge(d2, on='t', how='outer').
...                          merge(d3, on='t', how='outer').
...                          sort_values('t')),
...                          columns=['date','1-data','2-data','3-data'])
... 
         date 1-data 2-data 3-data
0  2014-01-01    1.1    1.4    NaN
1  2014-01-02    NaN    0.9    0.8
2  2014-01-03    0.5    NaN    NaN
3  2014-01-04    NaN    NaN    1.5

[4 rows x 4 columns]
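
The merged frame above still shows NaN where a date is missing, while the question asked for zeros and for an arbitrary number of arrays. One possible way to cover both, sketched here as an alternative to chained merge calls rather than as the answerer's method, is to turn each array into a Series indexed by date and let pd.concat align them. The helper name merge_by_date is hypothetical, the "N-data" column names are taken from the question, and the sample arrays are the three from the answer.

```python
import numpy as np
import pandas as pd

# Sample inputs copied from the answer: row 0 holds dates, row 1 holds values.
arrays = [
    np.array([['2014-01-01', '2014-01-03'], [1.1, 0.5]]),
    np.array([['2014-01-01', '2014-01-02'], [1.4, 0.9]]),
    np.array([['2014-01-02', '2014-01-04'], [0.8, 1.5]]),
]


def merge_by_date(arrays):
    """Hypothetical helper: align N [dates, values] arrays on date, missing -> 0."""
    series = [
        # Mixing strings and floats makes NumPy store everything as strings,
        # so the values row is cast back to float here.
        pd.Series(arr[1].astype(float), index=arr[0], name=f'{i}-data')
        for i, arr in enumerate(arrays, start=1)
    ]
    # axis=1 gives each series its own column and takes the union of all dates.
    merged = pd.concat(series, axis=1).sort_index().fillna(0.0)
    merged.index.name = 'date'
    return merged


result = merge_by_date(arrays)
print(result)
```

The dates stay as ISO-formatted strings in this sketch, so sorting them as text also sorts them chronologically. If real timestamps are needed, the index could be converted with pd.to_datetime.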
