Increasing the performance of visualising overlapping segments

Views: 132 Published: 2015/11/30 20:25:00 Tags: performance algorithm r graph segments

Problem description

I have a set of pairs of x points to draw segments along the x axis to create a custom read map in R:

Half the task of plotting these segments is deciding their y positions so that no two segments that overlap are on the same y level. For each segment, I iterate over y levels from the first position until I get to a position that does not yet contain a segment that will overlap the current one. I then record the end position of the current segment and move to the next one.

The actual code is a function as follows:

# Dummy data
# A list of start and end positions for each segment along the X axis. Sorted by start.
# Passing the function few.reads draws a map in half a second. Passing it many.reads takes about half an hour to complete.
few.reads <- data.frame( start=c(rep(10,150), rep(16,100), rep(43,50)), end=c(rep(30,150), rep(34,100), rep(57,50)) );
many.reads <- data.frame( start=c(rep(10,15000), rep(16,10000), rep(43,5000)), end=c(rep(30,15000), rep(34,10000), rep(57,5000)) );

#---
# A function to draw a series of overlapping segments (or "reads") along
# the x-axis. Where reads overlap, they are "stacked" down the y axis
#---
drawReads <- function(reads){

    # sort the reads by their start positions
    reads <- reads[order(reads$start),];

    # minimum and maximum for x axis
    minstart <- min(reads$start);
    maxend <- max(reads$end);

    # initialise yread: a list to keep track of used y levels
    yread <- c(minstart - 1);
    ypos <- c(); #holds the y position of the ith segment

    #---
    # This iteration step is the bottleneck. Worst case, when all reads are stacked on top
    # of each other, it has to iterate over many y levels to find the correct position for
    # the later reads
    #---
    # iterate over segments
    for (r in 1:nrow(reads)){
        read <- reads[r,];
        start <- read$start;
        placed <- FALSE;

        # iterate through yread to find the next available
        # y pos at this x pos (start)
        y <- 1;
        while(!placed){

            if(yread[y] < start){
                ypos[r] <- y;
                yread[y] <- read$end;
                placed <- TRUE;
            } 

            # current y pos is used by another segment, increment
            y <- y + 1;
            # initialize another y pos if we're at the end of the list
            if(y > length(yread)){
                yread[y] <- minstart-1;
            }
        }
    }

    #---
    # This is the plotting step
    # Once we are here the rest of the process is very quick
    #---
    # find the maximum y pos that is used to size up the plot
    maxy <- length(yread);
    miny = 1;


    reads$ypos <- ypos + miny;

    print("New Plot...")
    # Now we have all the information, start the plot
    plot.new();
    plot.window(xlim=c(minstart, maxend+((maxend-minstart)/10)), ylim=c(1,maxy));

    axis(3,xaxp=c(minstart,maxend,(maxend-minstart)/10));
    axis(2, yaxp=c(miny,maxy,3),tick=FALSE,labels=FALSE);

    print("Draw the reads...");
    maxy <- max(reads$ypos);
    segments(reads$start, maxy-reads$ypos, reads$end, maxy-reads$ypos, col="blue");   
}

My actual dataset is very large, and contains regions that can have up to 600000 reads as far as I can tell. The reads will naturally stack on top of each other, so it is very easy to realise the worst-case scenario, where all reads are overlapping each other. The time it takes to plot large numbers of reads is unacceptable for me, so I'm looking for a way to make the process more efficient. Can I replace my loops with something quicker? Is there an algorithm that can arrange the reads quicker? I really can't think of a better way of doing this at the moment.
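
For reference, the level-assignment step can be separated from the plotting and its inner while loop replaced by one vectorised comparison per read. The sketch below is only an illustration under assumptions: assignLevels, levelEnd and nLevels are hypothetical names that do not appear in the code above, and the function keeps the same first-fit rule as drawReads (a level is free when its last read ends strictly before the current start). It still scans every used level for each read in the worst case, so it is a cheaper loop rather than a better algorithm.

```r
# Hypothetical helper (not part of the original drawReads): returns the reads
# sorted by start with an extra ypos column holding each read's y level.
assignLevels <- function(reads) {
    reads  <- reads[order(reads$start), ]
    starts <- reads$start
    ends   <- reads$end
    n      <- nrow(reads)

    levelEnd <- numeric(n)  # end of the last read on each level (at most n levels)
    nLevels  <- 0L          # number of levels currently in use
    ypos     <- integer(n)  # preallocated instead of grown inside the loop

    for (r in seq_len(n)) {
        # lowest level whose last read ends before this read starts, or NA
        y <- if (nLevels > 0L) {
            match(TRUE, levelEnd[seq_len(nLevels)] < starts[r])
        } else {
            NA_integer_
        }
        if (is.na(y)) {
            # every used level overlaps this read: open a new level
            nLevels <- nLevels + 1L
            y <- nLevels
        }
        levelEnd[y] <- ends[r]
        ypos[r] <- y
    }

    reads$ypos <- ypos
    reads
}

# Toy data: the first two reads overlap, the third fits back on level 1.
toy <- data.frame(start = c(1, 2, 6), end = c(5, 7, 9))
placed <- assignLevels(toy)
print(placed$ypos)  # 1 2 1
```

The resulting ypos column could then be passed to segments() in the same way as in the plotting step of drawReads.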

Thanks for your help.
