Reading strand (+, -) column with fread, data.table package


Problem description


I'm trying to use fread to read a genome alignment into a data.table in R. Here's a snapshot of the alignment file:

USI-EAS28:1:100:1786:674#0/1    +   1_maternal   68326824    CTCAATTATACTGAAAGAAACACAATATATCATA  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0
USI-EAS28:1:100:1786:940#0/1    +   16_maternal  11407541    CTATTAGTGACCTGCTGTGGGACCTTGGGATGGT  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0
USI-EAS28:1:100:1786:705#0/1    +   1_maternal   63849584    CTGAGGGTTTGTGTCAGGAAGGGGTGTGGAATTG  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0   0:T>C
USI-EAS28:1:100:1786:1168#0/1   -   5_maternal   31381649    GCATCATTCATGAAACAATTTTCAAGAGAGGAAA  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0
USI-EAS28:1:100:1787:582#0/1    +   10_maternal  54587781    CTACAATAATAATAGGGGACTAAAACACCCCACT  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0
USI-EAS28:1:100:1787:62#0/1     +   10_maternal  70390747    CTATTTGCTACTGAATTGTTAATTTTAAAACAGT  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0
USI-EAS28:1:100:1788:573#0/1    -   7_maternal   92583837    CACTGTCAACATTAGACAGACCAATGAGACAAAG  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0
USI-EAS28:1:100:1788:854#0/1    +   7_maternal   129611206   GTTTGTTTTTTTTTTTGAGATGGAGTCTCATTTT  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0   32:C>T
USI-EAS28:1:100:1788:185#0/1    -   13_maternal  23694307    CAAACAAACTCAAAATGGACTATCGACTGAAAAA  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0
USI-EAS28:1:100:1788:1339#0/1   -   13_maternal  33699510    TTAACTCTAGTTTTTAGGGATTGCAAATTAGACG  IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII  0   0:A>G

The second column reports the strand the read aligns to (+ is forward, - is reverse). Unfortunately, fread tries to read this column as an integer, so every value comes out as 0. The column should be read as character, or even as a boolean for that matter. Playing with the sep and sep2 arguments doesn't help.

Solution

Thanks for reporting. Now fixed in v1.8.9 commit 849. + and - are now read as character, test added.
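
To check this on your own install, something like the following should do. It is only a sketch: it assumes a data.table version that includes the fix, assumes the alignment file is tab-delimited with no header row, and writes two records from the question to a temporary file (the temp file and the name aln are just for illustration).

```r
library(data.table)

# Two records from the question, written tab-separated (assumed delimiter).
sample_lines <- c(
  "USI-EAS28:1:100:1786:674#0/1\t+\t1_maternal\t68326824\tCTCAATTATACTGAAAGAAACACAATATATCATA\tIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII\t0",
  "USI-EAS28:1:100:1786:1168#0/1\t-\t5_maternal\t31381649\tGCATCATTCATGAAACAATTTTCAAGAGAGGAAA\tIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII\t0"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# No header row, so fread names the columns V1, V2, ...
aln <- fread(tmp, sep = "\t", header = FALSE)

# With the fix in place the strand column should be character, not integer.
class(aln$V2)
table(aln$V2)
```

If class(aln$V2) still reports integer, the installed version most likely predates the fix.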

Btw, we also intend to add colClasses so that you can override the column type that fread detects. The outstanding to-do list relating to fread is at the top of the source file here:
https://r-forge.r-project.org/scm/viewvc.php/pkg/src/fread.c?view=markup&root=datatable
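
For readers on a data.table release where fread already accepts colClasses, a rough sketch of the override might look like this. The shortened read names, the temp file, and the forward column are made up for illustration, and the tab delimiter is again an assumption about the file.

```r
library(data.table)

# Small tab-separated sample with shortened, made-up read names.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "read_a\t+\t1_maternal\t68326824",
  "read_b\t-\t5_maternal\t31381649"
), tmp)

# Force column 2 (the strand) to character regardless of what fread detects.
aln <- fread(tmp, sep = "\t", header = FALSE,
             colClasses = list(character = 2L))

# Optional: derive the boolean mentioned in the question (TRUE = forward strand).
aln[, forward := V2 == "+"]
```

Keeping the raw +/- column and adding a separate logical column leaves the original values intact, which is handy if the table is written back out later.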

