Joining factor levels of two columns in R
Problem description
I have 2 columns of data with the same type of data (Strings).
I want to join the levels of the columns, i.e. we have:
col1 col2
Bob John
Tom Bob
Frank Jane
Jim Bob
Tom Bob
... ... (and so on)
Now col1 has 4 levels (Bob, Tom, Frank, Jim) and col2 has 3 levels (John, Jane, Bob).
But I want both columns to have all the factor levels (Bob, Tom, Frank, Jim, John, Jane), so that I can later replace each of the names with a unique id, such that the final output would be:
col1 col2
1 5
2 1
3 6
4 1
2 1
That is, Bob -> 1, Tom -> 2, etc. in both columns.
Any ideas :) ?
edit: Thanks all for the wonderful answers! You are all awesome as far as I know :)
You want the factors to include all the unique names from both columns.
col1 <- factor(c("Bob", "Tom", "Frank", "Jim", "Tom"))
col2 <- factor(c("John", "Bob", "Jane", "Bob", "Bob"))
mynames <- unique(c(levels(col1), levels(col2)))
fcol1 <- factor(col1, levels = mynames)
fcol2 <- factor(col2, levels = mynames)
EDIT: a little nicer if you replace the third line with this:
mynames <- union(levels(col1), levels(col2))
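To get from the shared levels to the numeric ids asked for in the question, one option is to take the integer codes of the re-levelled factors. The sketch below is not part of the original answer and makes one assumption: ids should follow the order in which names first appear (col1 first, then col2). It therefore builds mynames from the values rather than from levels(), because factor levels are sorted by default and Tom would otherwise not end up as 2.

```r
# Sample data from the question
col1 <- factor(c("Bob", "Tom", "Frank", "Jim", "Tom"))
col2 <- factor(c("John", "Bob", "Jane", "Bob", "Bob"))

# Shared levels in order of first appearance (assumption: col1 is scanned first).
# as.character() makes sure we combine the names, not the factors' integer codes.
mynames <- unique(c(as.character(col1), as.character(col2)))

# Re-level both columns against the same set of names
fcol1 <- factor(col1, levels = mynames)
fcol2 <- factor(col2, levels = mynames)

# The integer code of each value is its position in mynames,
# so a given name maps to the same id in both columns
ids <- data.frame(col1 = as.integer(fcol1), col2 = as.integer(fcol2))
print(ids)
```

With the sample data this gives Bob -> 1 and Tom -> 2 in both columns. If the order of the ids does not matter, the union(levels(col1), levels(col2)) version above works the same way with as.integer().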