How can I rewrite this code as a one-liner (or in fewer lines on the command line) in Perl?


Question

I have code like this:

#!/usr/bin/perl
use strict;
use warnings;      
my %proteins = qw/
    UUU F UUC F UUA L UUG L UCU S UCC S UCA S UCG S UAU Y UAC Y UGU C UGC C UGG W
    CUU L CUC L CUA L CUG L CCU P CCC P CCA P CCG P CAU H CAC H CAA Q CAG Q CGU R CGC R CGA R CGG R
    AUU I AUC I AUA I AUG M ACU T ACC T ACA T ACG T AAU N AAC N AAA K AAG K AGU S AGC S AGA R AGG R
    GUU V GUC V GUA V GUG V GCU A GCC A GCA A GCG A GAU D GAC D GAA E GAG E GGU G GGC G GGA G GGG G
    /;
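# Three-argument open with a lexical filehandle; fail loudly if the file is missing.
open(my $input, '<', 'dna.txt') or die "Cannot open dna.txt: $!";
while (<$input>) {
    tr/acgt/ACGT/;    # upper-case the bases (tr takes a plain character list, no brackets or commas)
    y/GCTA/CGAU/;     # complement the DNA strand into RNA
    foreach my $protein (/(...)/g) {
        # Codons missing from the table are skipped silently.
        if (defined $proteins{$protein}) {
            print $proteins{$protein};
        }
    }
}
close($input);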

This code is related to the answer to my other question: DNA to RNA and Getting Proteins with Perl

The output of the program is:

SIMQNISGREAT

How can I rewrite this code in Perl so that it runs from the command line and uses less code (a one-liner, if possible)?

PS 1: dna.txt looks like this:

TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT

PS 2: If it makes the code shorter, it is acceptable to move the %proteins table into a file.

Recommended answer

Somebody (@kamaci) called my name in another thread. This is the best I can come up with while keeping the protein table on the command line:

perl -nE'say+map+substr("FYVDINLHL%VEMKLQL%VEIKLQFYVDINLHCSGASTRPWSGARTRP%SGARTRPCSGASTR",(s/GGG/GGC/i,vec($_,0,32)&101058048)%63,1),/.../g' dna.txt

(This uses Unix shell quoting; on Windows, swap the ' and " characters.) This version marks invalid codons with %. You can probably fix that by adding =~y/%//d at an appropriate spot.
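The % cleanup can be sketched as a short script rather than a one-liner. This is only an illustration under a few assumptions: the translate helper and the $TABLE and $MASK names are made up here, the table string and mask are copied from the one-liner, and tr/%//d (the same operator as y/%//d) is applied to the joined result instead of inside the map.

```perl
use strict;
use warnings;

# Table string and bit mask copied verbatim from the one-liner.
my $TABLE = 'FYVDINLHL%VEMKLQL%VEIKLQFYVDINLHCSGASTRPWSGARTRP%SGARTRPCSGASTR';
my $MASK  = 101058048;

# translate() is a hypothetical helper name, not part of the original answer.
sub translate {
    my ($dna) = @_;
    my $out = '';
    for my $match ($dna =~ /.../g) {
        my $codon = $match;          # work on a copy of the matched triple
        $codon =~ s/GGG/GGC/i;       # same collision workaround as the one-liner
        $out .= substr($TABLE, (vec($codon, 0, 32) & $MASK) % 63, 1);
    }
    $out =~ tr/%//d;                 # drop the markers for invalid codons
    return $out;
}

print translate('TCATAATACGTTTTGTATTCGCCAGCGCTTCGGTGT'), "\n";   # prints SIMQNISGREAT
```

With the sample dna.txt line this gives the same output as the original program. A triple such as ATT, which the table marks with %, simply disappears from the result.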

Hint: This picks out 6 bits from the raw ASCII encoding of an RNA triple, giving 64 codes between 0 and 101058048. To get a string index, I reduce the result modulo 63, but this creates one double mapping, which regrettably had to code for two different proteins. The s/GGG/GGC/i maps one of them to another codon that codes for the right protein.
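To make the hint concrete, here is a small sketch that computes the index for all 64 triples and reports which ones collide. The codon_index helper and the variable names are assumptions made for this illustration. The mask 101058048 is 0x06060600, which keeps bits 1 and 2 of each of the three bytes, so T and U (and lower-case letters) land on the same index as their upper-case DNA counterparts.

```perl
use strict;
use warnings;

my $MASK = 101058048;    # 0x06060600: bits 1 and 2 of each of the three bytes

# codon_index() is a hypothetical helper name used only for this illustration.
sub codon_index {
    my ($codon) = @_;
    return (vec($codon, 0, 32) & $MASK) % 63;
}

# Enumerate all 64 triples and group them by index to expose the collision.
my %by_index;
for my $x (qw(A C G T)) {
    for my $y (qw(A C G T)) {
        for my $z (qw(A C G T)) {
            my $codon = "$x$y$z";
            push @{ $by_index{ codon_index($codon) } }, $codon;
        }
    }
}

for my $i (sort { $a <=> $b } keys %by_index) {
    my @codons = @{ $by_index{$i} };
    print "index $i is shared by @codons\n" if @codons > 1;
}
# prints: index 0 is shared by AAA GGG
```

Only AAA and GGG share an index, which is why the one-liner rewrites GGG to GGC before the lookup. For the first triple of the sample input, TCA, the index is 36, and character 36 of the table string is S.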

Also note the parentheses before the % operator, which both isolate the , operator from the argument list of substr and fix the precedence of & vs %. If you ever use that in production code, you're a bad, bad person.
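The precedence point can be checked with a few lines. This is a minimal sketch using the first triple of the sample input; the variable names are made up for the illustration. In Perl, % binds more tightly than &, and 101058048 % 63 happens to be 0, so without the parentheses every lookup would collapse to index 0.

```perl
use strict;
use warnings;

my $MASK  = 101058048;
my $value = vec('TCA', 0, 32);    # first triple of the sample dna.txt

# % binds tighter than &, so this parses as $value & ($MASK % 63).
my $without_parens = $value & $MASK % 63;

# The parentheses force the mask to be applied before the modulo.
my $with_parens = ($value & $MASK) % 63;

print "without parentheses: $without_parens\n";   # prints 0
print "with parentheses:    $with_parens\n";      # prints 36
```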
