In Haskell, how do you extract strings from an XML document?

Problem description

If I have an XML document like this:

<root>
  <elem name="Greeting">
    Hello
  </elem>
  <elem name="Name">
    Name
  </elem>
</root>

and some Haskell type/data definitions like this:

 type Name = String
 type Value = String
 data LocalizedString = LS Name Value

and I wanted to write a Haskell function with the following signature:

 getLocalizedStrings :: String -> [LocalizedString]

where the first parameter was the XML text, and the returned value was:

 [LS "Greeting" "Hello", LS "Name" "Name"]

how would I do this?

If HaXml is the best tool, how would I use HaXml to achieve the above goal?

Thanks!

Solution

I've never actually bothered to figure out how to extract bits out of XML documents using HaXml; HXT has met all my needs.

{-# LANGUAGE Arrows #-}
import Data.Maybe
import Text.XML.HXT.Core  -- this module was named Text.XML.HXT.Arrow before HXT 9.0

type Name = String
type Value = String
data LocalizedString = LS Name Value

getLocalizedStrings :: String -> Maybe [LocalizedString]
getLocalizedStrings = listToMaybe . runLA (xread >>> getRoot)

atTag :: ArrowXml a => String -> a XmlTree XmlTree
atTag tag = deep $ isElem >>> hasName tag

getRoot :: ArrowXml a => a XmlTree [LocalizedString]
getRoot = atTag "root" >>> listA getElem

getElem :: ArrowXml a => a XmlTree LocalizedString
getElem = atTag "elem" >>> proc x -> do
    name <- getAttrValue "name" -< x
    value <- getChildren >>> getText -< x
    returnA -< LS name value

You'd probably like a little more error-checking (i.e. don't just lazily use atTag like me; actually verify that <root> is the root, <elem> is a direct descendant, etc.), but this works just fine on your example.
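As a rough illustration of the stricter checking mentioned above, here is a minimal sketch, not a definitive implementation. It assumes HXT 9 or later (where the module is Text.XML.HXT.Core). The names getRootStrict, getElemStrict, getLocalizedStringsStrict and trim are made up for this example. It also trims surrounding whitespace from the text, which the code above does not do, and derives Eq and Show so results can be compared and printed.

```haskell
{-# LANGUAGE Arrows #-}
import Control.Arrow
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Data.Maybe (listToMaybe)
import Text.XML.HXT.Core  -- assumption: HXT >= 9.0

type Name = String
type Value = String
data LocalizedString = LS Name Value
  deriving (Eq, Show)

-- Hypothetical helper: strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Accept a node only if it is itself a <root> element (no deep search),
-- and collect only <elem> elements that are its direct children.
getRootStrict :: ArrowXml a => a XmlTree [LocalizedString]
getRootStrict =
  isElem >>> hasName "root" >>>
  listA (getChildren >>> isElem >>> hasName "elem" >>> getElemStrict)

-- getAttrValue0 fails when the attribute is missing, so an <elem>
-- without a name attribute is skipped instead of getting an empty name.
getElemStrict :: ArrowXml a => a XmlTree LocalizedString
getElemStrict = proc x -> do
  name  <- getAttrValue0 "name" -< x
  value <- listA (getChildren >>> getText) >>^ concat -< x
  returnA -< LS name (trim value)

-- Nothing when the parsed input has no top-level <root> element.
getLocalizedStringsStrict :: String -> Maybe [LocalizedString]
getLocalizedStringsStrict = listToMaybe . runLA (xread >>> getRootStrict)
```

With the document from the question, getLocalizedStringsStrict should yield Just [LS "Greeting" "Hello", LS "Name" "Name"]. Whether to skip or reject an <elem> that lacks a name attribute is a design choice; this sketch skips it.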


Now, if you need an introduction to Arrows, unfortunately I don't know of any good ones. I myself learned it the "thrown into the ocean to learn how to swim" way.

Something that may be helpful to keep in mind is that the proc/-< syntax is simply sugar for the basic arrow operations (arr, >>>, etc.), just like do/<- is simply sugar for the basic monad operations (return, >>=, etc.). The following are equivalent:

getAttrValue "name" &&& (getChildren >>> getText) >>^ uncurry LS

proc x -> do
    name <- getAttrValue "name" -< x
    value <- getChildren >>> getText -< x
    returnA -< LS name value
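To try the equivalence without installing HXT, the same pattern can be sketched with ordinary functions, which are also arrows. This example uses only Control.Arrow from base. The Pair type and the two function names are invented for illustration and are not part of the answer's code.

```haskell
{-# LANGUAGE Arrows #-}
import Control.Arrow

-- Hypothetical result type standing in for LocalizedString.
data Pair = Pair Int String
  deriving (Eq, Show)

-- Combinator style: run both functions on the same input with (&&&),
-- then post-process the resulting tuple with (>>^).
pairCombinators :: String -> Pair
pairCombinators = length &&& reverse >>^ uncurry Pair

-- proc style: the same computation written with arrow notation.
pairProc :: String -> Pair
pairProc = proc s -> do
  n <- length  -< s
  r <- reverse -< s
  returnA -< Pair n r
```

Both functions return the same Pair for any input string; for example, each maps "abc" to Pair 3 "cba".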
