Replace NA with mode based on ID attribute

Question

I have a dataset dt and I want to replace the NA values of each attribute with the mode of that attribute within the same id, as follows:

Before:

 id  att  
  1  v
  1  v
  1  NA
  1  c
  2  c
  2  v
  2  NA
  2  c

The result I am looking for is:

 id  att
  1  v
  1  v
  1  v
  1  c
  2  c
  2  v
  2  c
  2  c

I have made some attempts. For example, I found a similar question that replaces NA with the mean (which has a built-in function), so I tried to adapt its code as follows:

for (i in 1:dim(dt)[1]) {
    if (is.na(dt$att[i])) {
      att_mode <-                  # I am stuck here to return the mode of an attribute based on ID
      dt$att[i] <- att_mode 
    }
  }

I found the following function to calculate the mode:

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

Link: Is there a built-in function for finding the mode?

But I have no idea how to apply it inside the for loop. I tried the apply and ave functions, but they do not seem to be the right choice because of the different dimensions.

Could anyone help me return the mode inside my for loop?

Thanks

Recommended answer

We can use na.aggregate from library(zoo) and specify FUN as Mode. Since this is a group-by operation, we can do it with data.table: convert the 'data.frame' to a 'data.table' (setDT(df1)) and, grouped by 'id', apply na.aggregate.

library(data.table)
library(zoo)
setDT(df1)[, att:= na.aggregate(att, FUN=Mode), by = id]
df1
#    id att
#1:  1   v
#2:  1   v
#3:  1   v
#4:  1   c
#5:  2   c
#6:  2   v
#7:  2   c
#8:  2   c
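For readers without zoo or data.table, the same grouped fill can be sketched in base R. This is a minimal sketch and not part of the original answer: it rebuilds the example data from the question, reuses the OP's Mode, and introduces a hypothetical helper named fill_mode that is applied per id with ave().

```r
# Mode as defined in the question: the most frequent value of x
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Example data from the question, with att stored as character
df1 <- data.frame(
  id  = c(1, 1, 1, 1, 2, 2, 2, 2),
  att = c("v", "v", NA, "c", "c", "v", NA, "c"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fill the NAs in x with the mode of its non-missing values
fill_mode <- function(x) replace(x, is.na(x), Mode(x[!is.na(x)]))

# ave() applies fill_mode within each id and keeps the original row order
df1$att <- ave(df1$att, df1$id, FUN = fill_mode)
df1$att
# "v" "v" "v" "c" "c" "v" "c" "c"
```

The NAs are dropped before calling Mode because Mode would otherwise count NA as a candidate value.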






A similar option with dplyr:

library(dplyr)
df1 %>%
     group_by(id) %>%
     mutate(att = na.aggregate(att, FUN=Mode))
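The question asked specifically how to obtain the mode inside the for loop. One way to complete that loop is sketched below. It is not from the original answer, and it assumes dt has the columns id and att shown in the question, with att stored as character.

```r
# Mode as defined in the question
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Example data from the question
dt <- data.frame(
  id  = c(1, 1, 1, 1, 2, 2, 2, 2),
  att = c("v", "v", NA, "c", "c", "v", NA, "c"),
  stringsAsFactors = FALSE
)

for (i in seq_len(nrow(dt))) {
  if (is.na(dt$att[i])) {
    # Non-missing att values from the rows that share this row's id
    same_id <- dt$att[dt$id == dt$id[i] & !is.na(dt$att)]
    dt$att[i] <- Mode(same_id)
  }
}
dt$att
# "v" "v" "v" "c" "c" "v" "c" "c"
```

Values filled in earlier iterations are visible to later ones, but since each filled value is already the group's mode, the result does not change. The grouped approaches above avoid the row-by-row scan and are usually preferable on larger data.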

NOTE: Mode is the function from the OP's post. Also, this assumes that 'att' is of character class.
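If 'att' is a factor, or if some id has no observed value at all, a slightly more defensive variant may help. The sketch below is an assumption-laden illustration, not part of the original answer: df2 is made-up data and fill_mode_safe is a hypothetical helper. It converts the column to character first, and it leaves a group untouched when there is nothing to impute from.

```r
# Mode as defined in the question
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Hypothetical helper: like a plain mode fill, but skips groups with no data
fill_mode_safe <- function(x) {
  observed <- x[!is.na(x)]
  if (length(observed) == 0L) return(x)  # nothing to impute from
  replace(x, is.na(x), Mode(observed))
}

# Made-up data: att is a factor, and id 3 has only a missing value
df2 <- data.frame(
  id  = c(1, 1, 2, 2, 3),
  att = factor(c("v", NA, "c", "c", NA))
)

df2$att <- as.character(df2$att)  # work on character, as the answer assumes
df2$att <- ave(df2$att, df2$id, FUN = fill_mode_safe)
df2$att
# "v" "v" "c" "c" NA
```

When two values are tied for most frequent, Mode returns whichever appears first in the group, because unique() keeps first-appearance order and which.max() picks the first maximum.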
