合并日期时间列上的数据(POSIXct 格式) [英] Merging Data on date time column (POSIXct format)

查看:76
本文介绍了合并日期时间列上的数据(POSIXct 格式)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想合并日期时间列上的两个数据框 dtype.date-time 列包含相似和不同的值.但我无法合并它们,以便所有唯一的日期时间行最终都在那里......在不常见的列中使用 NA.我在第二个数据框的 date_time 列中获取 NA.在 R 和 python 中都尝试过

I want to merge two data frames on Date Time column dtype.date-time columns contain both similar and different values. But I am unable to merge them such that all unique date-time rows are finally there..with NA in uncommon columns. I am getting NAs in date_time column for 2nd data frame. tried both in R and python

df=pd.merge(df_met, df_so2, how='left', on='Date_Time')

在 R..data_type 中是使用 as.POSIXct 的日期时间

df_2<-join(so2, met_km, type="inner")
df3 <- merge(so2, met_km, all = TRUE)
df_4 <- merge(so2, met_km, by.x = "Date_Time", by.y = "Date_Time")

df_so2:

 X  POC  Datum        Date_Time          Date_GMT  Sample.Measurement  MDL
 1    2  WGS84  2015-01-01 3:00  01/01/2015 09:00                 2.3  0.2
 2    2  WGS84  2015-01-01 4:00  01/01/2015 10:00                 2.5  0.2
 3    2  WGS84  2015-01-01 5:00  01/01/2015 11:00                 2.1  0.2
 4    2  WGS84  2015-01-01 6:00  01/01/2015 12:00                 2.3  0.2
 5    2  WGS84  2015-01-01 7:00  01/01/2015 13:00                 1.1  0.2

df_met:

 X        Date_Time  air_temp_set_1  dew_point_temperature_set_1
 1  2015-01-01 1:00            35.6                         35.6
 2  2015-01-01 2:00            35.6                         35.6
 3  2015-01-01 3:00            35.6                         35.6
 4  2015-01-01 4:00            33.8                         33.8
 5  2015-01-01 5:00            33.2                         33.2
 6  2015-01-01 6:00            33.8                         33.8
 7  2015-01-01 7:00            33.8                         33.8

预期输出:

 X  POC    Datum        Date_Time          Date_GMT  Sample.Measurement  MDL
 1  1.0  2 WGS84  2015-01-01 3:00  01/01/2015 09:00                 2.3  0.2
 2  2.0  2 WGS84  2015-01-01 4:00  01/01/2015 10:00                 2.5  0.2
 3  NaN      NaN  2015-01-01 1:00               NaN                 NaN  NaN
 4  NaN      NaN  2015-01-01 2:00               NaN                 NaN  NaN

推荐答案

merge on outer 应该得到他们所有:

  • pandas.DataFrame.合并
  • outer:使用来自两个帧的键的并集,类似于 SQL 全外连接;按字典顺序对键进行排序.
  • 根据您的评论,您需要所有日期,而不仅仅是预期输出
  • 中显示的日期
  • 添加参数sort=True,如果你想让它们按date
  • 排序

    merge on outer should get them all:

    • pandas.DataFrame.merge
    • outer: use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically.
    • based upon your comment, you want all the dates, not just those shown in Expected Output
    • add the parameter, sort=True if you want them sorted by date
    • df_exp = pd.merge(df_so2, df_met, on='Date_Time', how='outer')
      
       X_x  POC  Datum        Date_Time          Date_GMT  Sample.Measurement  MDL  X_y  air_temp_set_1  dew_point_temperature_set_1
       1.0  2.0  WGS84  2015-01-01 3:00  01/01/2015 09:00                 2.3  0.2    3            35.6                         35.6
       2.0  2.0  WGS84  2015-01-01 4:00  01/01/2015 10:00                 2.5  0.2    4            33.8                         33.8
       3.0  2.0  WGS84  2015-01-01 5:00  01/01/2015 11:00                 2.1  0.2    5            33.2                         33.2
       4.0  2.0  WGS84  2015-01-01 6:00  01/01/2015 12:00                 2.3  0.2    6            33.8                         33.8
       5.0  2.0  WGS84  2015-01-01 7:00  01/01/2015 13:00                 1.1  0.2    7            33.8                         33.8
       NaN  NaN    NaN  2015-01-01 1:00               NaN                 NaN  NaN    1            35.6                         35.6
       NaN  NaN    NaN  2015-01-01 2:00               NaN                 NaN  NaN    2            35.6                         35.6
      

      没有来自 df_met 的列:

      df_exp.drop(columns=['X_y', 'air_temp_set_1', 'dew_point_temperature_set_1'], inplace=True)
      df_exp.rename(columns={'X_x': 'X'}, inplace=True)
      
         X  POC  Datum        Date_Time          Date_GMT  Sample.Measurement  MDL
       1.0  2.0  WGS84  2015-01-01 3:00  01/01/2015 09:00                 2.3  0.2
       2.0  2.0  WGS84  2015-01-01 4:00  01/01/2015 10:00                 2.5  0.2
       3.0  2.0  WGS84  2015-01-01 5:00  01/01/2015 11:00                 2.1  0.2
       4.0  2.0  WGS84  2015-01-01 6:00  01/01/2015 12:00                 2.3  0.2
       5.0  2.0  WGS84  2015-01-01 7:00  01/01/2015 13:00                 1.1  0.2
       NaN  NaN    NaN  2015-01-01 1:00               NaN                 NaN  NaN
       NaN  NaN    NaN  2015-01-01 2:00               NaN                 NaN  NaN
      

      这篇关于合并日期时间列上的数据(POSIXct 格式)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆