Finding overlap in ranges with R

Views: 84 | Published: 2020/9/21 3:04:30 | Tags: r, bioinformatics

Problem description

I have two data.frames, each with three columns: chrom, start and stop. Let's call them rangesA and rangesB. For each row of rangesA, I want to find which row in rangesB (if any) fully contains the rangesA row, by which I mean rangesAChrom == rangesBChrom, rangesAStart >= rangesBStart and rangesAStop <= rangesBStop.

Right now I'm doing the following, which I don't like very much. Note that I'm looping over the rows of rangesA for other reasons, but none of those reasons is likely to be a big deal; the loop just ends up making things more readable given this particular solution.

rangesA:

chrom   start   stop
 5       100     105
 1       200     250
 9       275     300

rangesB:

chrom    start    stop
  1       200      265
  5       99       106
  9       275      290

For each row in rangesA:

for (row in seq_len(nrow(rangesA))) {
  # Elementwise `&` (not `&&`) so that every row of rangesB is tested
  matches <- which((rangesB[, 'chrom'] == rangesA[row, 'chrom']) &
                   (rangesB[, 'start'] <= rangesA[row, 'start']) &
                   (rangesB[, 'stop']  >= rangesA[row, 'stop']))
}

I figure there has to be a better way to do this than looping over this construct, and by better I mean faster on large instances of rangesA and rangesB. Any ideas?
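
One possible loop-free direction, offered as a minimal base-R sketch rather than a definitive answer: join the two data.frames on chrom with merge() and then filter the joined pairs with a single vectorized comparison. The helper name find_containing and the index columns rowA and rowB are hypothetical names introduced here for illustration; the data is the example from the question.

```r
# Example data from the question (column names as given there).
rangesA <- data.frame(chrom = c(5, 1, 9),
                      start = c(100, 200, 275),
                      stop  = c(105, 250, 300))
rangesB <- data.frame(chrom = c(1, 5, 9),
                      start = c(200, 99, 275),
                      stop  = c(265, 106, 290))

# Hypothetical helper (not from the original post): returns one row per
# (rangesA row, rangesB row) pair where the rangesB interval fully
# contains the rangesA interval on the same chromosome.
find_containing <- function(a, b) {
  # Remember the original row numbers so hits can be reported by index.
  a$rowA <- seq_len(nrow(a))
  b$rowB <- seq_len(nrow(b))
  # Join on chromosome; the shared start/stop columns get suffixes .A and .B.
  pairs <- merge(a, b, by = "chrom", suffixes = c(".A", ".B"))
  # Vectorized containment test over all candidate pairs at once.
  keep <- pairs$start.B <= pairs$start.A & pairs$stop.B >= pairs$stop.A
  hits <- pairs[keep, c("rowA", "rowB")]
  hits <- hits[order(hits$rowA, hits$rowB), ]
  rownames(hits) <- NULL
  hits
}

hits <- find_containing(rangesA, rangesB)
print(hits)
```

On the example data, rows 1 and 2 of rangesA are contained in rows 2 and 1 of rangesB respectively, while row 3 has no match because its stop (300) exceeds 290. Note that merge() materializes every same-chromosome pair before filtering, so memory use grows with the number of ranges per chromosome. For large genomic data sets, the Bioconductor GenomicRanges package's findOverlaps() with type = "within" may be worth evaluating instead.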
