Get all Names from xml-conduit
Problem description
I'm parsing a modified version of the XML example from http://hackage.haskell.org/package/xml-conduit-1.1.0.9/docs/Text-XML-Stream-Parse.html
Here's what it looks like:
<?xml version="1.0" encoding="utf-8"?>
<population xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://example.com">
  <success>true</success>
  <row_count>2</row_count>
  <summary>
    <bananas>0</bananas>
  </summary>
  <people>
    <person>
      <firstname>Michael</firstname>
      <age>25</age>
    </person>
    <person>
      <firstname>Eliezer</firstname>
      <age>2</age>
    </person>
  </people>
</population>
How do I get a list of firstname and age for every person?
My goal is to use http-conduit to download this xml and then parse it, but I am looking for a solution on how to parse when there are no attributes (use tagNoAttrs?)
Here's what I've tried, and I've added my questions in the Haskell comments:
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.Trans.Resource
import Data.Conduit (($$))
import Data.Text (Text, unpack)
import Text.XML.Stream.Parse
import Control.Applicative ((<*))

data Person = Person Int Text
    deriving Show

-- Do I need to change the lambda function \age to something else to get both name and age?
parsePerson = tagNoAttr "person" $ \age -> do
    name <- content -- How do I get age from the content? "unpack" is for attributes
    return $ Person age name

parsePeople = tagNoAttr "people" $ many parsePerson

-- This doesn't ignore the xmlns attributes
parsePopulation = tagName "population" (optionalAttr "xmlns" <* ignoreAttrs) $ parsePeople

main = do
    people <- runResourceT $
        parseFile def "people2.xml" $$ parsePopulation
    print people
First, the parsing combinators in xml-conduit haven't been updated in quite a while, and they show their age. I recommend that most people use the DOM or cursor interface instead. That said, let's look at your example. There are two problems with your code:
- It doesn't properly handle XML namespaces. All of the element names are in the http://example.com namespace, and your code needs to reflect that.
- The parsing combinators demand that you account for all elements. They won't automatically skip over some elements for you.
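To illustrate the namespace point, here is a minimal sketch of how a namespaced element name can be written. It assumes the xml-types package (which provides Data.XML.Types) and the OverloadedStrings extension; the variable names are only illustrative. A string literal of the form "{namespace}local" is read as a Name carrying that namespace, while a bare "person" has no namespace and therefore does not match the elements in this document.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.XML.Types (Name (..))

main :: IO ()
main = do
    -- "{namespace}local" form: the braces carry the namespace URI.
    let namespaced = "{http://example.com}person" :: Name
        -- No braces: a name without any namespace.
        bare       = "person" :: Name
    print (nameLocalName namespaced)  -- local part of the namespaced name
    print (nameNamespace namespaced)  -- Just the namespace URI
    print (nameNamespace bare)        -- Nothing
    print (namespaced == bare)        -- the two names are not equal
```

This is why a call such as tagNoAttr "person" never fires on the sample file: the parser compares full names, namespace included.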
So here's an implementation using the streaming API that gets the desired result:
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad.Trans.Resource (MonadThrow, runResourceT)
import Data.Conduit (Consumer, ($$))
import Data.Text (Text)
import Data.Text.Read (decimal)
import Data.XML.Types (Event)
import Text.XML.Stream.Parse

data Person = Person Int Text
    deriving Show

-- No lambda is needed: tagNoAttr takes a plain parser for the element's body,
-- and the name and age are read from the child elements.
parsePerson :: MonadThrow m => Consumer Event m (Maybe Person)
parsePerson = tagNoAttr "{http://example.com}person" $ do
    name <- force "firstname tag missing" $ tagNoAttr "{http://example.com}firstname" content
    ageText <- force "age tag missing" $ tagNoAttr "{http://example.com}age" content
    -- Accept the age only if the whole text is a decimal number.
    case decimal ageText of
        Right (age, "") -> return $ Person age name
        _ -> force "invalid age value" $ return Nothing

-- Every element before <people> has to be consumed explicitly.
parsePeople :: MonadThrow m => Consumer Event m [Person]
parsePeople = force "no people tag" $ do
    _ <- tagNoAttr "{http://example.com}success" content
    _ <- tagNoAttr "{http://example.com}row_count" content
    _ <- tagNoAttr "{http://example.com}summary" $
        tagNoAttr "{http://example.com}bananas" content
    tagNoAttr "{http://example.com}people" $ many parsePerson

-- ignoreAttrs accepts and discards whatever attributes the root element carries.
parsePopulation :: MonadThrow m => Consumer Event m [Person]
parsePopulation = force "population tag missing" $
    tagName "{http://example.com}population" ignoreAttrs $ \() -> parsePeople

main :: IO ()
main = do
    people <- runResourceT $
        parseFile def "people2.xml" $$ parsePopulation
    print people
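The age check in the code above depends on how decimal from Data.Text.Read reports its result: on success it returns the parsed number together with whatever input was left over. The following sketch isolates that step; parseAge is a hypothetical helper name introduced here for illustration and is not part of the answer's code.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text      (Text)
import Data.Text.Read (decimal)

-- Hypothetical helper: accept the text only if it is a decimal number
-- with nothing left over after the digits.
parseAge :: Text -> Maybe Int
parseAge t =
    case decimal t of
        Right (age, "") -> Just age   -- all input consumed
        _               -> Nothing    -- no digits, or trailing characters

main :: IO ()
main = do
    print (parseAge "25")   -- Just 25
    print (parseAge "25x")  -- Nothing: "x" is left over
    print (parseAge "old")  -- Nothing: does not start with a digit
```

Matching on an empty remainder is what makes a value like "25x" count as invalid instead of being silently read as 25.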
Here's an example using the cursor API. Note that it has different error handling characteristics, but should produce the same result for well-formed input.
{-# LANGUAGE OverloadedStrings #-}
import Text.XML
import Text.XML.Cursor
import Data.Text (Text)
import Data.Text.Read (decimal)
import Data.Monoid (mconcat)

main :: IO ()
main = do
    doc <- Text.XML.readFile def "people2.xml"
    let cursor = fromDocument doc
    print $ cursor $// element "{http://example.com}person" >=> parsePerson

data Person = Person Int Text
    deriving Show

parsePerson :: Cursor -> [Person]
parsePerson c = do
    let name = c $/ element "{http://example.com}firstname" &/ content
        ageText = c $/ element "{http://example.com}age" &/ content
    case decimal $ mconcat ageText of
        Right (age, "") -> [Person age $ mconcat name]
        _ -> []
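Since the stated goal is to download the document with http-conduit before parsing it, the cursor approach could be adapted along these lines. This is only a sketch under some assumptions: it uses simpleHttp from Network.HTTP.Conduit and parseLBS_ from Text.XML, the names peopleFromLBS and fetchPeople are hypothetical, and any URL passed to fetchPeople is up to the caller. The main function runs the parser on a small in-memory document so that the example does not depend on a network connection.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Lazy as L
import           Data.Text            (Text)
import qualified Data.Text            as T
import           Data.Text.Read       (decimal)
import           Network.HTTP.Conduit (simpleHttp)
import           Text.XML             (def, parseLBS_)
import           Text.XML.Cursor

data Person = Person Int Text
    deriving Show

-- Pure part: turn the raw bytes of the document into a list of people.
-- parseLBS_ throws an exception if the bytes are not well-formed XML.
peopleFromLBS :: L.ByteString -> [Person]
peopleFromLBS bytes =
    fromDocument (parseLBS_ def bytes)
        $// element "{http://example.com}person" >=> parsePerson

-- Same logic as the cursor example above: a person with an invalid
-- age is dropped from the result.
parsePerson :: Cursor -> [Person]
parsePerson c =
    case decimal (T.concat ageText) of
        Right (age, "") -> [Person age (T.concat name)]
        _               -> []
  where
    name    = c $/ element "{http://example.com}firstname" &/ content
    ageText = c $/ element "{http://example.com}age" &/ content

-- Hypothetical helper: download the document, then parse it.
fetchPeople :: String -> IO [Person]
fetchPeople url = peopleFromLBS <$> simpleHttp url

main :: IO ()
main = print (peopleFromLBS sample)
  where
    -- Small in-memory stand-in for the downloaded document.
    sample = "<population xmlns=\"http://example.com\"><people>\
             \<person><firstname>Michael</firstname><age>25</age></person>\
             \</people></population>"
```

Keeping the parsing in a pure function over the response body means the same code works whether the bytes come from a file, a test fixture, or an HTTP response.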