Assign colors to a data frame based on shared values with a character string in R
Problem description
I'm working in R. I have many different data frames that contain sample names, and I'm trying to assign a color to each row in each data frame based on the sample name. Many rows share the same sample name, but my output data is messy, so I can't sort by sample name. Here's a small example of what I have:
names <- c( "TC3", "102", "172", "136", "142", "143", "AC2G" )
colors <- c( "darkorange", "forestgreen", "darkolivegreen", "darkgreen", "darksalmon", "firebrick3", "firebrick1" )
dataA <- c( "JR13-101A", "TC3B", "JR12-136C", "AC2GA", "TC3A" )
newcolors <- rep( NA, length( dataA ) )
dataA <- as.data.frame( cbind( dataA, newcolors ) )
I've tried the following (with loops, I know, but that's all I could think to do). I'm also trying to get away from falling back on loops in R, but I have yet to break the habit.
Here's what I've tried. It's probably something obvious, but I just get NA returned for all the newcolors:
for( i in 1:nrow( dataA ) ) {
  for( j in 1:length( names ) ) {
    if( grepl( dataA$dataA[ i ], names[ j ] ) ) {
      dataA$newcolors[ i ] <- colors[ j ]
    }
  }
}
Recommended answer
Here is a solution, which eliminates 1 loop:
dataA$newcolors<-as.character(dataA$newcolors)
for( j in 1:length( names ) ) {
dataA$newcolors[grep(names[j], dataA$dataA)] <- colors[j]
}
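As a small illustration of why the as.character() line is there: in R versions before 4.0.0, as.data.frame() turns string columns into factors by default, so newcolors can end up as a factor with no levels. The sketch below uses hypothetical vectors f and ch (not part of the original code) to show what happens when a color name is assigned to each kind of column.

```r
# Hypothetical stand-in for a factor column that holds only NA (no levels)
f <- factor(c(NA, NA, NA))

# Assigning a value that is not an existing level yields NA plus a warning
# ("invalid factor level, NA generated"); suppressed here to keep output clean
suppressWarnings(f[2] <- "darkorange")
is.na(f[2])   # the assignment did not stick

# The same assignment on a character vector works as expected
ch <- as.character(factor(c(NA, NA, NA)))
ch[2] <- "darkorange"
ch[2]
```

In R 4.0.0 and later the default is stringsAsFactors = FALSE, so the column is usually character already and the conversion is harmless.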
Converting the newcolors column to character instead of a factor makes the updating much easier. If the list of names is short, the single loop should not have much of a performance impact.
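If you want to drop the explicit for loop as well, one possible approach is sketched below. It is only a sketch under a few assumptions: the data frame is rebuilt with stringsAsFactors = FALSE, names are matched as literal substrings (fixed = TRUE), and when several names match the same row the first one wins (the loop above lets the last match win). The helper variable match_idx is introduced here for illustration.

```r
names  <- c("TC3", "102", "172", "136", "142", "143", "AC2G")
colors <- c("darkorange", "forestgreen", "darkolivegreen", "darkgreen",
            "darksalmon", "firebrick3", "firebrick1")

# Build the data frame with a character column rather than a factor
dataA <- data.frame(
  dataA = c("JR13-101A", "TC3B", "JR12-136C", "AC2GA", "TC3A"),
  stringsAsFactors = FALSE
)

# For each sample ID, find the index of the first name it contains
match_idx <- vapply(
  dataA$dataA,
  function(id) {
    # grepl(pattern, x): the name is the pattern, the sample ID is searched
    hits <- which(vapply(names, grepl, logical(1), x = id, fixed = TRUE))
    if (length(hits) > 0) hits[[1]] else NA_integer_
  },
  integer(1)
)

# Rows with no matching name get NA as their color
dataA$newcolors <- colors[match_idx]
```

Note that vapply() still iterates internally, so this is mostly a matter of style rather than speed. With the example data, "JR13-101A" contains none of the names and stays NA.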