Joining factor levels of two columns
Question
I have two columns of data of the same type (strings).
I want to join the levels of the columns, i.e. we have:
col1 col2
Bob John
Tom Bob
Frank Jane
Jim Bob
Tom Bob
... ... (and so on)
Now col1 has 4 levels (Bob, Tom, Frank, Jim) and col2 has 3 levels (John, Jane, Bob).
But I want both columns to have all the factor levels (Bob, Tom, Frank, Jim, Jane, John), so that I can later replace each name with a unique id. The final output would be:
col1 col2
1 5
2 1
3 6
4 1
2 1
That is, Bob -> 1, Tom -> 2, etc. in both columns.
Any ideas? :)
Edit: Thanks all for the wonderful answers! You are all awesome as far as I know :)
Recommended answer
You want the factors to include all the unique names from both columns.
col1 <- factor(c("Bob", "Tom", "Frank", "Jim", "Tom"))
col2 <- factor(c("John", "Bob", "Jane", "Bob", "Bob"))
mynames <- unique(c(levels(col1), levels(col2)))
fcol1 <- factor(col1, levels = mynames)
fcol2 <- factor(col2, levels = mynames)
Edit: it is a little nicer if you replace the third line with this:
mynames <- union(levels(col1), levels(col2))
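To get the numeric ids the question asks for, one option is to convert the releveled factors with as.integer(). The sketch below is an illustration rather than part of the original answer. It assumes the ids should follow the order in which names first appear (col1, then col2), so it builds mynames from the values instead of from levels(), which returns each column's names in sorted order. The names id1 and id2 are made up for this example.

```r
col1 <- factor(c("Bob", "Tom", "Frank", "Jim", "Tom"))
col2 <- factor(c("John", "Bob", "Jane", "Bob", "Bob"))

# Shared levels in order of first appearance (col1 first, then col2).
# Using levels() here would give sorted names, hence different ids.
mynames <- unique(c(as.character(col1), as.character(col2)))

fcol1 <- factor(col1, levels = mynames)
fcol2 <- factor(col2, levels = mynames)

# as.integer() on a factor returns each value's position in levels(),
# so the same name maps to the same id in both columns.
id1 <- as.integer(fcol1)  # 1 2 3 4 2
id2 <- as.integer(fcol2)  # 5 1 6 1 1
```

With the answer's own mynames (built from levels()), the ids are still consistent across both columns; only the numbering differs, because the levels then start from each column's sorted names.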