Dictionary help in Julia - creating dictionary from text file


Problem description

I'm attempting to create a dictionary from the contents of a text file in Julia for use in a bioinformatics problem. The file is formatted like this:

UUU F      CUU L      AUU I      GUU V
UUC F      CUC L      AUC I      GUC V
...

I want to make a dictionary where the key is the 3 letter part (the codon), and the entry is the one letter part (the amino acid). I'm able to pull out the right components with grep:

for m in eachmatch(r"([AUGC]{3,3})\s([A-Z])", file)
    codon, aa = m.captures

If I print codon and aa in this loop, I get the correct output (all the codons, all the aa's), but I can't figure out how to put it into a dictionary. If I do codons = {codon => aa} at the end of the loop, I end up with a dictionary that only contains the last entry.

I'm sure the syntax is something really obvious, but I'm a biologist, not a programmer, so my reading of the documentation isn't getting me anywhere. It says:

Given a dictionary D, the syntax D[x] returns the value of key x (if it exists) or throws an error, and D[x] = y stores the key-value pair x => y in D (replacing any existing value for the key x).
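
The behaviour described in that passage can be tried on a small dictionary. This is only a minimal sketch, assuming a current Julia 1.x release; the name D and the sample entries are chosen for illustration and are not from the original post:

```julia
# Empty dictionary with String keys and String values
D = Dict{String,String}()

# D[x] = y stores the pair x => y
D["UUU"] = "F"
D["CUU"] = "X"

# Assigning to an existing key replaces its value
D["CUU"] = "L"

# D[x] returns the value stored for key x
println(D["UUU"])

# D["GGG"] would throw a KeyError because the key is absent;
# haskey and get are ways to check or supply a default instead
println(haskey(D, "GGG"))
println(get(D, "GGG", "?"))
```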

I tried codons[codon] = aa at the end of the loop (having initialized the dictionary with codons = {} before the loop), but I get the error:

no method setindex!(Array{Any,1},SubString{UTF8String},SubString{UTF8String})
at In[35]:5
 in anonymous at no file:4

Any help would be greatly appreciated.

EDIT: Evidently, I'm not initializing the dictionary correctly. If I do codons = {"blah" => "blahblah"} at the beginning, the loop works and fills in correctly. So a modified question: how do you initialize an empty dictionary?

EDIT2: Minimal not working example:

file = open(readall, "rna_codons.txt")
codons = {}
for m in eachmatch(r"([AUGC]{3,3})\s([A-Z])", file)
    codon, aa = m.captures
    codons[codon] = aa
end

Solution

Just to summarize a Minimal Working Example (MWE) for your case of reading your formatted text file into a Julia Dict...

file = open(readall, "rna_codons.txt")
codons = Dict()
for m in eachmatch(r"([AUGC]{3,3})\s([A-Z])", file)
    codon, aa = m.captures
    codons[codon] = aa
end

N.B.: If the file is very large, there is likely a faster way of generating your Dict.
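
The example above targets the Julia version current when this was written. In Julia 1.x, readall and the {} literal no longer exist, so the same loop might look roughly like the sketch below. It assumes Julia 1.x; the variable text is a stand-in for the file contents (for a real file, read("rna_codons.txt", String) would return such a string), and only the two rows shown in the question are used:

```julia
# Sample text standing in for the contents of rna_codons.txt
text = """
UUU F      CUU L      AUU I      GUU V
UUC F      CUC L      AUC I      GUC V
"""

# Typed empty dictionary mapping codon => amino acid
codons = Dict{String,String}()

for m in eachmatch(r"([AUGC]{3})\s([A-Z])", text)
    # captures holds the two parenthesised groups of the regex
    codon, aa = m.captures
    # the SubString key and value are converted to String on insertion
    codons[codon] = aa
end

println(codons["AUC"])
```

Declaring the key and value types up front is optional (a plain Dict() also works), but it keeps the dictionary from falling back to Any for both.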

EDIT

Given your apparent text file format, here's another way to create your Dict. I made no tests to determine any performance loss/gain.

codon_array = open(readdlm, "rna_codons.txt")
codons = Dict{ASCIIString,ASCIIString}(codon_array[:,1:2:end][:],codon_array[:,2:2:end][:])

N.B.: If you use it, better check it for correctness.
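
For Julia 1.x, where readdlm lives in the DelimitedFiles standard library and both ASCIIString and the two-array Dict constructor are gone, the same column-slicing idea could be sketched as follows. This assumes Julia 1.x and a file whose rows all have the same number of whitespace-separated fields; text, table, codon_keys and aa_values are illustrative names, and the in-memory string stands in for the file:

```julia
using DelimitedFiles  # standard library module that provides readdlm

# Sample text standing in for the contents of rna_codons.txt
text = """
UUU F      CUU L      AUU I      GUU V
UUC F      CUC L      AUC I      GUC V
"""

# Whitespace-delimited parse; runs of spaces count as one delimiter
table = readdlm(IOBuffer(text))

# Odd columns hold codons, even columns hold amino acids;
# vec flattens both slices in the same column-major order
codon_keys = String.(vec(table[:, 1:2:end]))
aa_values  = String.(vec(table[:, 2:2:end]))

# Pair the two vectors element by element
codons = Dict(zip(codon_keys, aa_values))

println(codons["GUC"])
```

As with the original version, this is untested for performance and worth checking against the regex approach on the full file.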
