Perl: How to join two columns of a text file, in which values of the first column should match in order with the values of the second column

Views: 216 | Published: 2020/9/21 3:11:51 | Tags: perl, bioinformatics

Question

I am a beginner with Perl programming. The problem I am working on right now is how to get gene lengths from a text file. The text file contains the gene name (column 10), the start site (column 6), and the end site (column 7). The length can be derived from the difference between columns 6 and 7. My problem is how to match each gene name (from column 10) with the corresponding length derived from columns 6 and 7. Thank you very much!

open (IN, "Alu.txt");
open (OUT, ">Alu_subfamlength3.csv");

while ($a = <IN>) {
    @data = split (/\t/, $a);
    $list {$data[10]}++;
    $genelength {$data[7] - $data[6]};
}

foreach $sub (keys %list){
    $gene = join ($sub, $genelength);

    print "$gene\n";
}
close (IN);
close (OUT);

Recommended answer

I'm not sure about this as I haven't seen your data. But I think you're making this far harder than necessary. I think that everything you need for each gene is on a single line of the input file, so you can process the file a line at a time without any extra variables. Something like this:

open (IN, "Alu.txt");
open (OUT, ">Alu_subfamlength3.csv");

while ($a = <IN>) {
    @data = split (/\t/, $a);
    print OUT "Gene: $data[10] / Length: ", $data[7] - $data[6], "\n";
}

But there are some improvements we can make. First, we'll stop using $a (a special package variable that sort uses, together with $b, so it is best avoided in general code) and switch to $_ instead. At the same time, we'll add use strict and use warnings and ensure that all of our variables are declared.

use strict;
use warnings;

open (IN, "Alu.txt");
open (OUT, ">Alu_subfamlength3.csv");

while (<IN>) { # This puts the line into $_
    my @data = split (/\t/); # split uses $_ by default
    print OUT "Gene: $data[10] / Length: ", $data[7] - $data[6], "\n";
}
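As a possible further step that the answer itself does not take, the open calls could check for failure and use lexical filehandles instead of the barewords IN and OUT. The following is only a sketch: the helper name write_lengths is hypothetical, and the field positions are the ones used in the code above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): reads a tab-separated
# file and writes one "Gene / Length" line per input row.
sub write_lengths {
    my ($in_name, $out_name) = @_;

    # Three-argument open with lexical filehandles; die with the OS error
    # message ($!) if either file cannot be opened.
    open my $in,  '<', $in_name  or die "Cannot read $in_name: $!";
    open my $out, '>', $out_name or die "Cannot write $out_name: $!";

    while (<$in>) {
        chomp;                   # drop the trailing newline
        my @data = split /\t/;   # split uses $_ by default
        print {$out} "Gene: $data[10] / Length: ", $data[7] - $data[6], "\n";
    }

    close $in;
    close $out or die "Cannot close $out_name: $!";
    return;
}

# Usage (file names taken from the question):
# write_lengths('Alu.txt', 'Alu_subfamlength3.csv');
```

With this shape, a missing or unreadable Alu.txt stops the program with a clear message instead of silently producing an empty output file.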

Next, we'll remove the unnecessary parentheses on the split() call and use a list slice to get just the values you want and store them in individual variables.

use strict;
use warnings;

open (IN, "Alu.txt");
open (OUT, ">Alu_subfamlength3.csv");

while (<IN>) { # This puts the line into $_
    my ($start, $end, $gene) = (split /\t/)[6, 7, 10]; # split uses $_ by default
    print OUT "Gene: $gene / Length: ", $end - $start, "\n";
}
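To see what the list slice does in isolation, here is a minimal sketch on a made-up line. The sample data and the helper name gene_and_length are assumptions for illustration only. Note that slice indices are zero-based, so [6, 7, 10] selects the 7th, 8th and 11th fields; if the columns in the question are counted from 1, the indices would need to be [5, 6, 9] instead.

```perl
use strict;
use warnings;

# Hypothetical sample line: 11 tab-separated fields, made up for illustration.
my $line = join "\t", qw(f0 f1 f2 f3 f4 f5 100 250 f8 f9 AluY);

# Hypothetical helper wrapping the slice used in the answer.
sub gene_and_length {
    my ($row) = @_;
    chomp $row;
    # Indices are zero-based: [6, 7, 10] are the 7th, 8th and 11th fields.
    my ($start, $end, $gene) = (split /\t/, $row)[6, 7, 10];
    return ($gene, $end - $start);
}

my ($gene, $length) = gene_and_length($line);
print "Gene: $gene / Length: $length\n";   # prints "Gene: AluY / Length: 150"
```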

Next, we'll remove the explicit filenames. Instead, we'll read data from STDIN and write it to STDOUT. This is a common Unix/Linux approach called an I/O filter. It will make your program more flexible (and, as a bonus, easier to write).

use strict;
use warnings;

while (<>) { # Empty <> reads from files named on the command line, or from STDIN if there are none
    my ($start, $end, $gene) = (split /\t/)[6, 7, 10];
    # print to STDOUT
    print "Gene: $gene / Length: ", $end - $start, "\n";
}

To use this program, we use an operating system feature called I/O redirection. If the program is called filter_genes, we would run it like this:

$ ./filter_genes < Alu.txt > Alu_subfamlength3.csv

And if the names of your files change in the future, you don't need to change your program, just the command line that calls it.
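If the same name appears on several lines and one row per name is wanted (the question's code counts names in a hash, which hints at this, but that is an assumption), the per-line approach could be extended along the following lines. The helper name summarise_lengths, the sample rows and the CSV header are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: reads tab-separated rows from a filehandle and
# returns a hash reference of name => { count => N, total => summed length }.
sub summarise_lengths {
    my ($fh) = @_;
    my %stats;
    while (my $row = <$fh>) {
        chomp $row;
        my ($start, $end, $gene) = (split /\t/, $row)[6, 7, 10];
        next unless defined $gene;   # skip rows with too few fields
        $stats{$gene}{count}++;
        $stats{$gene}{total} += $end - $start;
    }
    return \%stats;
}

# Made-up sample rows, read through an in-memory filehandle.
my $sample = join '', map { join("\t", @$_) . "\n" } (
    [ qw(f0 f1 f2 f3 f4 f5 100 250  f8 f9 AluY)  ],
    [ qw(f0 f1 f2 f3 f4 f5 400 700  f8 f9 AluSx) ],
    [ qw(f0 f1 f2 f3 f4 f5 900 1000 f8 f9 AluY)  ],
);
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $stats = summarise_lengths($fh);
close $fh;

# One comma-separated row per name, sorted by name.
print "name,count,total_length\n";
for my $name (sort keys %$stats) {
    print join(',', $name, $stats->{$name}{count}, $stats->{$name}{total}), "\n";
}
```

To use this as a filter like the program above, pass \*STDIN to summarise_lengths instead of the in-memory handle and redirect the input and output files on the command line.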
