How to keep only certain numbers in a giant number in bash?
Problem description
I have a huge file that contains genotypes. Each line is one locus (a SNP), and the digits on a line are concatenated into one giant number. Each character position is one individual, so the digit at a given position belongs to the same individual on every line. In this example there are 96 digits per line, so 96 individuals. Here is an example:
921212922222222212292222229222221222211222222222222219929222292222922229919922222222222222292292
929111221111111221191211222912222221111210229921222129929222291221921219929992122122222211292299
292222922212222122292222222222921122222222921219222222912222299199922222912222222222221222292229
222222221122122922122222112212212221222122221922999229222229222212992221222222221222222222222212
222222222292212221291112192222122121922122222122229212222221212212922221222122122912222922222111
222222921222222922292222122222922222229222122291299122922222229222922229229222219222292222122222
I want to keep only certain "columns", but since each line is a single number, I would need to split it into separate columns and then concatenate the ones I need, so that the output has the same format but contains only the selected columns.
For example, if I select columns 1 and 3, the end result should be:
91
99
22
22
22
22
I've tried this (the data above is in output.geno):
cat ~/Desktop/output.geno| awk '{ print $1 $3}'
echo ~/Desktop/output.geno | grep -o ""
If you want to play with it, here is a toy dataset:
echo "921212922222222212292222229222221222211222222222222219929222292222922229919922222222222222292292
929111221111111221191211222912222221111210229921222129929222291221921219929992122122222211292299
292222922212222122292222222222921122222222921219222222912222299199922222912222222222221222292229
222222221122122922122222112212212221222122221922999229222229222212992221222222221222222222222212
222222222292212221291112192222122121922122222122229212222221212212922221222122122912222922222111
222222921222222922292222122222922222229222122291299122922222229222922229229222219222292222122222" > ~/Desktop/output.geno
Recommended answer
You can use cut:
cut -c 1,3 output.geno
This gives:
91
99
22
22
22
22
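The same idea extends to more positions. The sketch below is only an illustration and is not part of the original answer: it uses a hypothetical temporary file holding the first 10 characters of the first three lines of the toy dataset, and the variable names (tmp, ranges, swapped, reordered) are made up for the example. It shows that cut -c accepts lists and ranges, that cut emits the selected characters in file order regardless of the order in which they are listed, and that awk's substr can be used when a different output order is needed.

```shell
# Hypothetical toy input: the first 10 characters of the first three lines above
tmp=$(mktemp)
printf '%s\n' 9212129222 9291112211 2922229222 > "$tmp"

# Select characters 1 and 3 plus the range 5-7
ranges=$(cut -c 1,3,5-7 "$tmp")

# cut emits characters in file order, so "3,1" selects the same output as "1,3"
swapped=$(cut -c 3,1 "$tmp")

# To print column 3 before column 1, pick the characters explicitly with substr
reordered=$(awk '{ print substr($0, 3, 1) substr($0, 1, 1) }' "$tmp")

printf '%s\n' "$ranges" "$swapped" "$reordered"
rm -f "$tmp"
```

For a real file with 96 individuals, the list passed to -c would simply name the positions of the individuals to keep, for example cut -c 1,3,10-20 output.geno.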