How to parse a uniprot-file with parsec?

Problem description

I am a newbie to Haskell, but it seems like a powerful language that I want to learn. I was adapting some code from the parsec chapter of Real World Haskell. I tried to make my own version that parses the content of a uniprot-file. This is a file that consists of records (each starting with ">"), where each record consists of lines. My code seems very close to what is done in the example, but I am getting a lot of errors, mostly about types. One of them, among others, is that I am taking the output of readFile (IO String) instead of a String. I would appreciate it if someone could help me understand what is wrong in my approach...

import Text.ParserCombinators.Parsec

main:: IO()

parseSprot :: IO String -> Either ParseError [[String]]

parseSprot input = parse uniprotFile "(unknown)" input
    where   
        uniprotFile = endBy record eol
        record = sepBy lines (char '>')
        lines = many (noneOf ",\n")
        eol = char '\n'

main = do 
    parseSprot $ readFile "uniprot_sprot.fasta" 
    putStrLn "hey"

Solution

parseSprot doesn't need an IO in its signature.

parseSprot :: String -> Either ParseError [[String]]
...

The result of readFile is an IO String. You can do something with this String by binding the result of the readFile action into a new IO action. In do notation you can bind the result to a variable with <-:

main = do 
    fileContents <- readFile "uniprot_sprot.fasta"

The parseSprot function doesn't return a result in IO, so you can use it anywhere. In do notation, we distinguish a result bound to a variable from a declaration by using different syntax: x <- ... binds the result of an action to a variable, while let x = ... declares x to be whatever is on the right-hand side.

main = do 
    fileContents <- readFile "uniprot_sprot.fasta" 
    let parsedContents = parseSprot fileContents

To test what your parser is doing, you might want to print the value returned from parse.

main = do 
    fileContents <- readFile "uniprot_sprot.fasta" 
    let parsedContents = parseSprot fileContents
    print parsedContents
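
Since parse returns an Either, it may be more readable to pattern match on the two cases rather than print the raw value. The following is only a sketch: the parser is copied from the question with just the signature corrected, and summarize is a hypothetical helper name that does not appear in the original answer.

```haskell
import Text.ParserCombinators.Parsec

-- Parser from the question; only the type signature has been corrected.
parseSprot :: String -> Either ParseError [[String]]
parseSprot input = parse uniprotFile "(unknown)" input
    where
        uniprotFile = endBy record eol
        record = sepBy lines (char '>')
        lines = many (noneOf ",\n")   -- shadows Prelude.lines inside this where block
        eol = char '\n'

-- Hypothetical helper: turn the Either into a human-readable message.
summarize :: Either ParseError [[String]] -> String
summarize (Left err)      = "parse error: " ++ show err
summarize (Right records) = "parsed " ++ show (length records) ++ " record(s)"

main :: IO ()
main = do
    -- <- binds the result of the IO action
    fileContents <- readFile "uniprot_sprot.fasta"
    -- parseSprot is pure, so it can be applied directly inside an expression
    putStrLn (summarize (parseSprot fileContents))
```

Left carries the ParseError and Right carries the parsed records, so both outcomes are handled explicitly.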

Without do notation you can write this as

main = readFile "uniprot_sprot.fasta" >>= print . parseSprot

>>= takes the result of the first computation and feeds it into a function to decide what to do next.
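
To make the correspondence between do notation and >>= concrete, here is a small sketch that writes the same action three ways. It uses only the Prelude; countLines is a hypothetical stand-in for parseSprot (any pure function from String would do), and the file name is the one from the question.

```haskell
-- (>>=) specialised to IO has the type:  IO a -> (a -> IO b) -> IO b

-- Hypothetical stand-in for parseSprot: a pure function with no IO in its type.
countLines :: String -> Int
countLines = length . lines

-- 1. With do notation.
mainDo :: IO ()
mainDo = do
    contents <- readFile "uniprot_sprot.fasta"
    print (countLines contents)

-- 2. The same action with an explicit >>= and a lambda.
mainBind :: IO ()
mainBind = readFile "uniprot_sprot.fasta" >>= \contents -> print (countLines contents)

-- 3. Point-free: the lambda is replaced by function composition.
mainPointFree :: IO ()
mainPointFree = readFile "uniprot_sprot.fasta" >>= print . countLines

main :: IO ()
main = mainPointFree
```

Because (.) binds tighter than >>=, the last form reads as readFile ... >>= (print . countLines), which is the same shape as the one-liner in the answer.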
