How to count the number of specific characters in each line of a file?

Views: 50 | Published: 2021/6/15 20:59:10 | Tags: perl, fasta


Question

I'm trying to count the number of 'N's in each sequence of a FASTA file such as:

>Header
AGGTTGGNNNTNNGNNTNGN
>Header2
AGNNNNNNNGNNGNNGNNGN

Each header marks one read, and I want the count of 'N's per read so that I can build a histogram. The final output should look something like this:

# of N's   # of Reads
0            300
1            240

and so on...

So here there are 300 sequences (reads) that contain zero 'N's.

use strict;
use warnings;

my $file = shift;
my $output_file = shift;

my $line;
my $sequence;
my $length;
my $char_N_count = 0;
my @array;
my $count = 0;

if (!defined ($output_file)) {
    die "USAGE: Input FASTA file\n";
}
open (IFH, "$file") or die "Cannot open input file$!\n";
open (OFH, ">$output_file") or die "Cannot open output file $!\n";

while ($line = <IFH>) {
    chomp $line;
    next if $line =~ /^>/;
    $sequence = $line;
    @array = split('', $sequence);
    foreach my $element (@array) {
        if ($element eq 'N') {
            $char_N_count++;
        }
    }
    print "$char_N_count\n";
}

Recommended answer

Try this. I changed a few things, such as using scalar (lexical) filehandles. There are many ways to do this in Perl, so other people will have other ideas. Here I used an array indexed by the N count, which may have gaps in it; another option is to store the results in a hash keyed by the count.

I just realised I'm not using $output_file, because I have no idea what you want to do with it :) If your intent is to write to it, just change each 'print' at the end to 'print $out_fh'.

use strict;
use warnings;

my $file = shift;
my $output_file = shift;

if (!defined ($output_file)) {
    die "USAGE: $0 <input_file> <output_file>\n";
}
open (my $in_fh, '<', $file) or die "Cannot open input file '$file': $!\n";
open (my $out_fh, '>', $output_file) or die "Cannot open output file '$output_file': $!\n";

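my @results = ();    # $results[$k] = number of reads with exactly $k 'N's
while (my $line = <$in_fh>) {
    next if $line =~ /^>/;            # skip FASTA header lines
    my $num_n = ($line =~ tr/N//);    # tr/// returns how many 'N's it matched
    $results[$num_n]++;
}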

print "# of N's\t# of Reads\n";

for (my $i = 0; $i < scalar(@results) ; $i++) {
    unless (defined($results[$i])) {
        $results[$i] = 0;
        # another option is to 'next' if you don't want to show the zero totals
    }
    print "$i\t\t$results[$i]\n";
}
close($in_fh);
close($out_fh);
exit;
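
The answer mentions that the results could instead be stored in a hash keyed by the count. A minimal sketch of that variant follows. The helper name count_n_histogram and the in-memory sample input are assumptions made for illustration and are not part of the original answer. Like the code above, it treats every non-header line as one read.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): tally how many
# sequence lines contain each number of 'N' characters.
sub count_n_histogram {
    my ($fh) = @_;
    my %reads_by_n;                        # key: N count, value: number of reads
    while (my $line = <$fh>) {
        next if $line =~ /^>/;             # skip FASTA header lines
        my $num_n = ($line =~ tr/N//);     # count 'N's without modifying $line
        $reads_by_n{$num_n}++;
    }
    return \%reads_by_n;
}

# Sample input taken from the question, read through an in-memory filehandle.
my $fasta = ">Header\nAGGTTGGNNNTNNGNNTNGN\n>Header2\nAGNNNNNNNGNNGNNGNNGN\n";
open(my $in_fh, '<', \$fasta) or die "Cannot open in-memory input: $!\n";
my $hist = count_n_histogram($in_fh);
close($in_fh);

# Print only the counts that actually occurred, in numeric order.
# To write to a file instead, print to an output handle: print {$out_fh} ...
print "# of N's\t# of Reads\n";
for my $n (sort { $a <=> $b } keys %$hist) {
    print "$n\t\t$hist->{$n}\n";
}
```

With a hash there are no gaps to fill, so counts that never occur are simply not printed. If the zero rows are wanted, the array version above is the simpler choice.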
