Adding multiple integer ranges of values from a column in an ifelse statement in R
Problem description
I am working with genomic data in a data frame that has one column for nucleotide position and one for its conservation score. I also know which ranges of nucleotide positions are introns and which are exons. I want to create a third column that labels each position as "INTRON" or "EXON".
As an example, suppose that for nucleotide positions 1-70000 I want to label 10000-10200, 17800-21000, and 43000-54000 as introns and everything else as exons in another column (hypothetical data). Is there a way to specify multiple ranges of values from a column inside the ifelse function? That would more or less solve my problem. Or is there a better way of doing it?
Recommended answer
Assuming you have a data frame like this:
d <- data.frame(position=round(runif(100, 1, 70000)))
You can combine the range tests with logical operators (& within a range, | between ranges):
d$status <- ifelse((d$position >= 10000 & d$position <= 10200) |
                   (d$position >= 17800 & d$position <= 21000) |
                   (d$position >= 43000 & d$position <= 54000),
                   'INTRON', 'EXON')
Or you can use nested ifelse calls:
d$status <- ifelse(d$position >= 10000 & d$position <= 10200, 'INTRON',
            ifelse(d$position >= 17800 & d$position <= 21000, 'INTRON',
            ifelse(d$position >= 43000 & d$position <= 54000, 'INTRON', 'EXON')))
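Writing each range out by hand gets unwieldy once there are more than a few introns. As a rough sketch of one alternative (not part of the original answer), the ranges could be kept in their own data frame and each position tested against all of them at once. The names introns and label_regions below are hypothetical, chosen only for illustration, and the ranges are the ones given in the question, treated as inclusive at both ends.

```r
# Hypothetical lookup table of intron ranges (values from the question).
introns <- data.frame(start = c(10000, 17800, 43000),
                      end   = c(10200, 21000, 54000))

# Hypothetical helper: label each position as "INTRON" if it falls inside
# any [start, end] range (inclusive), otherwise "EXON".
label_regions <- function(position, ranges) {
  in_range <- vapply(
    position,
    function(p) any(p >= ranges$start & p <= ranges$end),
    logical(1)
  )
  ifelse(in_range, "INTRON", "EXON")
}

# Same kind of example data as in the answer above.
d <- data.frame(position = round(runif(100, 1, 70000)))
d$status <- label_regions(d$position, introns)
```

With this layout, adding or changing a range only means editing the introns table. For very large inputs a dedicated interval tool may be faster, but that is outside what this sketch tries to show.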