How to complete missing factor levels in a data frame?


Problem description

Suppose I have something like this:

df <- data.frame(
      PERSON = c("Peter", "Peter", "Marcel" , "Lisa", "Lisa"),        
      FRUIT = c("Apple", "Peach","Apple", "Apple", "Peach" ), 
      A = c(100, 200, 100, 200, 300), 
      B=c(1,2,3,4,5) )
df$PERSON <- as.factor(df$PERSON)
df$FRUIT <- factor(df$FRUIT, levels = c("Apple", "Peach", "Coconut"))

which results in:

str(df): 'data.frame':  5 obs. of  4 variables:
$ PERSON: Factor w/ 3 levels "Lisa","Marcel",..: 3 3 2 1 1
$ FRUIT : Factor w/ 3 levels "Apple","Peach",..: 1 2 1 1 2
$ A     : num  100 200 100 200 300
$ B     : num  1 2 3 4 5

I want to expand this data frame so that for every PERSON all levels of FRUIT are present, like this:

  PERSON   FRUIT   A B
1  Peter   Apple 100 1
2  Peter   Peach 200 2
3  Peter Coconut   0 0
4 Marcel   Apple 100 3
5 Marcel   Peach   0 0
6 Marcel Coconut   0 0
7   Lisa   Apple 200 4
8   Lisa   Peach 300 5
9   Lisa Coconut   0 0

Missing values for A and B should be filled with 0.

I tried tidyr::complete(df$FRUIT, 0), but it seems that I

Recommended answer

complete() takes the data frame as its first argument, followed by the columns to expand. By default, the values for newly created combinations are filled with NA, but we can change that to 0 by passing a named list to the fill argument.

library(tidyr)
complete(df, PERSON, FRUIT, fill = list(A = 0, B = 0))
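
As a self-contained sketch of the answer above, the following rebuilds the example data from the question and checks the expanded result. It assumes the tidyr package is installed; the variable name result is only illustrative. Note that the attempt in the question passed a single column (df$FRUIT) as the first argument, whereas complete() expects the whole data frame there.

```r
library(tidyr)

# Rebuild the example data from the question.
df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT  = c("Apple", "Peach", "Apple", "Apple", "Peach"),
  A = c(100, 200, 100, 200, 300),
  B = c(1, 2, 3, 4, 5)
)
df$PERSON <- as.factor(df$PERSON)
# "Coconut" is declared as a level although no row uses it.
df$FRUIT <- factor(df$FRUIT, levels = c("Apple", "Peach", "Coconut"))

# Expand to every PERSON x FRUIT combination. Because FRUIT is a factor,
# complete() uses its full set of levels, including the unused "Coconut".
# Rows created by the expansion get A = 0 and B = 0 instead of NA.
result <- complete(df, PERSON, FRUIT, fill = list(A = 0, B = 0))

print(result)
```

With 3 persons and 3 fruit levels, the result should have 9 rows. The rows are ordered by the factor levels (alphabetical for PERSON here), so the order differs from the hand-written table in the question.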
