In Haskell, how do you extract strings from an XML document?
If I have an XML document like this:
<root>
    <elem name="Greeting">
        Hello
    </elem>
    <elem name="Name">
        Name
    </elem>
</root>
and some Haskell type/data definitions like this:
type Name = String
type Value = String
data LocalizedString = LS Name Value
and I wanted to write a Haskell function with the following signature:
getLocalizedStrings :: String -> [LocalizedString]
where the first parameter was the XML text, and the returned value was:
[LS "Greeting" "Hello", LS "Name" "Name"]
how would I do this?
If HaXml is the best tool, how would I use HaXml to achieve the above goal?
Thanks!
I've never actually bothered to figure out how to extract bits out of XML documents using HaXml; HXT has met all my needs.
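{-# LANGUAGE Arrows #-}

import Data.Maybe (listToMaybe)
-- Newer hxt releases (9.x) expose these names from Text.XML.HXT.Core instead.
import Text.XML.HXT.Arrow

type Name = String
type Value = String
data LocalizedString = LS Name Value

-- Parse the XML text; Nothing means no <root> element was found.
getLocalizedStrings :: String -> Maybe [LocalizedString]
getLocalizedStrings = (.) listToMaybe . runLA $ xread >>> getRoot

-- Select every element with the given tag name, at any depth.
atTag :: ArrowXml a => String -> a XmlTree XmlTree
atTag tag = deep $ isElem >>> hasName tag

-- Collect all <elem> entries found under <root> into one list.
getRoot :: ArrowXml a => a XmlTree [LocalizedString]
getRoot = atTag "root" >>> listA getElem

-- Build one LocalizedString from the name attribute and the text content.
-- The text is returned as-is, including surrounding whitespace.
getElem :: ArrowXml a => a XmlTree LocalizedString
getElem = atTag "elem" >>> proc x -> do
    name  <- getAttrValue "name" -< x
    value <- getChildren >>> getText -< x
    returnA -< LS name value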
You'd probably like a little more error-checking (i.e. don't just lazily use atTag like me; actually verify that <root> is the root, that <elem> is a direct child, etc.), but this works just fine on your example.
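As a rough illustration of that extra checking, here is a minimal sketch, assuming hxt 9.x (where the combinators live in Text.XML.HXT.Core). The names getLocalizedStringsStrict, strictRoot and strictElem are hypothetical and not part of HXT. It avoids deep, so only a top-level <root> and its direct <elem> children are accepted, and it also trims the whitespace around each text value.

```haskell
{-# LANGUAGE Arrows #-}
module StrictLocalized where

import Data.Maybe (listToMaybe)
-- Assumption: hxt 9.x; older releases exposed these names from Text.XML.HXT.Arrow.
import Text.XML.HXT.Core

type Name  = String
type Value = String
data LocalizedString = LS Name Value deriving (Eq, Show)

-- Hypothetical stricter variant: Nothing unless the top-level element is <root>.
getLocalizedStringsStrict :: String -> Maybe [LocalizedString]
getLocalizedStringsStrict = listToMaybe . runLA (xread >>> strictRoot)

-- Match the parsed top-level node itself instead of searching with 'deep'.
strictRoot :: ArrowXml a => a XmlTree [LocalizedString]
strictRoot = isElem >>> hasName "root" >>> listA (getChildren >>> strictElem)

-- Accept only direct <elem> children that carry a name attribute.
strictElem :: ArrowXml a => a XmlTree LocalizedString
strictElem = isElem >>> hasName "elem" >>> hasAttr "name" >>> proc x -> do
    name  <- getAttrValue "name" -< x
    value <- getChildren >>> getText -< x
    -- Collapse runs of whitespace and drop leading/trailing whitespace.
    returnA -< LS name (unwords (words value))
```

Loaded in GHCi, getLocalizedStringsStrict applied to the document from the question should give a Just with two entries, while a document whose top-level element is not <root> gives Nothing.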
Now, if you need an introduction to Arrows, unfortunately I don't know of any good one. I myself learned it the "thrown into the ocean to learn how to swim" way.
Something that may be helpful to keep in mind is that the proc/-< syntax is simply sugar for the basic arrow operations (arr, >>>, etc.), just like do/<- is simply sugar for the basic monad operations (return, >>=, etc.). The following are equivalent:
getAttrValue "name" &&& (getChildren >>> getText) >>^ uncurry LS

proc x -> do
    name  <- getAttrValue "name" -< x
    value <- getChildren >>> getText -< x
    returnA -< LS name value
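To see that equivalence without installing HXT, the same two shapes can be tried on ordinary functions, since (->) is itself an Arrow instance. This is only a sketch: nameOf and valueOf are hypothetical stand-ins for getAttrValue "name" and getChildren >>> getText, and a pair of strings stands in for the XML node.

```haskell
{-# LANGUAGE Arrows #-}
module ProcSugar where

import Control.Arrow (returnA, (&&&), (>>^))

data LocalizedString = LS String String deriving (Eq, Show)

-- Hypothetical stand-ins for the HXT arrows; a pair plays the role of the node.
nameOf, valueOf :: (String, String) -> String
nameOf  = fst
valueOf = snd

-- Combinator form: run both arrows on the same input, then apply LS to the pair.
pointFree :: (String, String) -> LocalizedString
pointFree = nameOf &&& valueOf >>^ uncurry LS

-- proc form: the same computation written with arrow notation.
withProc :: (String, String) -> LocalizedString
withProc = proc x -> do
    name  <- nameOf  -< x
    value <- valueOf -< x
    returnA -< LS name value
```

Both pointFree ("Greeting", "Hello") and withProc ("Greeting", "Hello") evaluate to LS "Greeting" "Hello". The combinator form parses as (nameOf &&& valueOf) >>^ uncurry LS because &&& binds more tightly than >>^.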