How to complete missing factor levels in a data frame?
Question
Suppose I have something like this:
df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT = c("Apple", "Peach", "Apple", "Apple", "Peach"),
  A = c(100, 200, 100, 200, 300),
  B = c(1, 2, 3, 4, 5)
)
# Convert both key columns to factors; FRUIT gets an extra, unused level "Coconut"
df$PERSON <- as.factor(df$PERSON)
df$FRUIT <- factor(df$FRUIT, levels = c("Apple", "Peach", "Coconut"))
which results in:
str(df): 'data.frame': 5 obs. of 4 variables:
$ PERSON: Factor w/ 3 levels "Lisa","Marcel",..: 3 3 2 1 1
$ FRUIT : Factor w/ 3 levels "Apple","Peach",..: 1 2 1 1 2
$ A : num 100 200 100 200 300
$ B : num 1 2 3 4 5
I want to expand this data frame so that for every PERSON all levels of FRUIT are present, like this:
PERSON FRUIT A B
1 Peter Apple 100 1
2 Peter Peach 200 2
3 Peter Coconut 0 0
4 Marcel Apple 100 3
5 Marcel Peach 0 0
6 Marcel Coconut 0 0
7 Lisa Apple 200 4
8 Lisa Peach 300 5
9 Lisa Coconut 0 0
Missing values for A and B should be filled with 0.
I tried tidyr::complete(df$FRUIT, 0), but it seems I ...
Recommended answer
complete takes the data frame as its first argument, followed by the columns to expand. By default the fill value is NA, but we can change it to 0 by specifying it per column in a list passed to fill.
library(tidyr)
complete(df, PERSON, FRUIT, fill = list(A = 0, B = 0))
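To make the mechanics concrete, here is a minimal base-R sketch of roughly the same idea: build every PERSON x FRUIT combination from the factor levels, left-join the original rows onto it, and replace the resulting NA values with 0. This is an illustration under stated assumptions, not tidyr's actual implementation. The names grid and completed are hypothetical, and PERSON is built with factor(..., levels = unique(...)) instead of the question's as.factor() so that the levels follow the question's Peter, Marcel, Lisa order rather than alphabetical order.

```r
# Rebuild the example data from the question.
df <- data.frame(
  PERSON = c("Peter", "Peter", "Marcel", "Lisa", "Lisa"),
  FRUIT  = c("Apple", "Peach", "Apple", "Apple", "Peach"),
  A = c(100, 200, 100, 200, 300),
  B = c(1, 2, 3, 4, 5)
)
# Assumption: keep levels in order of first appearance (as.factor() would sort them).
df$PERSON <- factor(df$PERSON, levels = unique(df$PERSON))
df$FRUIT  <- factor(df$FRUIT, levels = c("Apple", "Peach", "Coconut"))

# Step 1: every PERSON x FRUIT combination, taken from the factor levels,
# so the unused level "Coconut" is included.
grid <- expand.grid(
  PERSON = levels(df$PERSON),
  FRUIT  = levels(df$FRUIT)
)

# Step 2: left join the original rows; combinations without data get NA in A and B.
completed <- merge(grid, df, by = c("PERSON", "FRUIT"), all.x = TRUE)

# Step 3: fill the missing values with 0, as fill = list(A = 0, B = 0) does.
completed$A[is.na(completed$A)] <- 0
completed$B[is.na(completed$B)] <- 0

# Order rows by the factor levels of both key columns.
completed <- completed[order(completed$PERSON, completed$FRUIT), ]
print(completed)
```

With the tidyr call, the row order likewise follows the factor levels, so if PERSON is created with as.factor() the people appear alphabetically (Lisa, Marcel, Peter) rather than in the order shown in the question.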