HXT: Select a node by position with HXT in Haskell?


Problem description


I’m trying to parse some XML files with Haskell. I’m using HXT for this, partly to learn how arrows are used in real-world applications, so I’m quite new to arrows.

In XPath (and HaXml) it’s possible to select a node by position, let’s say: /root/a[2]/b

I can’t figure out how to do something like that with HXT, even after reading the documentation again and again.

Here is some sample code I’m working with:

module Main where

import Text.XML.HXT.Core

testXml :: String
testXml = unlines
    [ "<?xml version=\"1.0\"?>"
    , "<root>"
    , "    <a>"
    , "        <b>first element</b>"
    , "        <b>second element</b>"
    , "    </a>"
    , "    <a>"
    , "        <b>third element</b>"
    , "    </a>"
    , "    <a>"
    , "        <b>fourth element</b>"
    , "        <b>enough...</b>"
    , "    </a>"
    , "</root>"
    ]

selector :: ArrowXml a => a XmlTree String
selector = getChildren /> isElem >>> hasName "a" -- how to select second <a>?
                       /> isElem >>> hasName "b"
                       /> getText

main :: IO ()
main = do
    let doc = readString [] testXml
    nodes <- runX $ doc >>> selector
    mapM_ putStrLn nodes

The desired output would be:

third element

Thanks in advance!

Solution

I believe this solution selects "/root/a[2]/b" (all "b" tags inside the second "a" tag):

selector :: ArrowXml a => Int -> a XmlTree String
selector nth =
    (getChildren /> isElem >>> hasName "a")   -- the parentheses required!
    >. (!! nth) 
    /> isElem >>> hasName "b" /> getText

(With nth = 1 the result is ["third element"]. The index passed to !! is zero-based, whereas XPath's a[2] is one-based.)
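Note that (!! nth) is a partial function, so an out-of-range index raises an exception at run time. A minimal sketch of a total variant follows, assuming that an empty result is acceptable when there are too few "a" elements. The names nthResult and selectorSafe are hypothetical helpers, not part of HXT, and withValidate no is my own addition (the question uses readString []), included because the sample document has no DTD to validate against.

```haskell
module Main where

import Text.XML.HXT.Core

testXml :: String
testXml = unlines
    [ "<?xml version=\"1.0\"?>"
    , "<root>"
    , "    <a><b>first element</b><b>second element</b></a>"
    , "    <a><b>third element</b></a>"
    , "    <a><b>fourth element</b><b>enough...</b></a>"
    , "</root>"
    ]

-- Hypothetical helper (not part of HXT): keep only the result at
-- zero-based index n of an arrow, or no result if there are too few.
nthResult :: ArrowList a => Int -> a b c -> a b c
nthResult n f = f >>. (take 1 . drop n)

-- Same selection as the answer's selector, but total in the index.
selectorSafe :: ArrowXml a => Int -> a XmlTree String
selectorSafe n =
    nthResult n (getChildren /> isElem >>> hasName "a")
    /> isElem >>> hasName "b" /> getText

main :: IO ()
main = do
    -- Assumption: validation is switched off because there is no DTD.
    let doc = readString [withValidate no] testXml
    found   <- runX (doc >>> selectorSafe 1)  -- second <a>
    missing <- runX (doc >>> selectorSafe 7)  -- no such <a>
    print found
    print missing
```

Because nthResult takes the arrow as an ordinary function argument, the grouping that the parentheses enforce in the answer's code comes for free here. The out-of-range call yields an empty list instead of crashing.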

Explanation: As far as I can see, class (..., ArrowList a, ...) => ArrowXml a, so ArrowXml is a subclass of ArrowList. Looking through the ArrowList interface:

(>>.) :: a b c -> ([c] -> [d]) -> a b d
(>.) :: a b c -> ([c] -> d) -> a b d

So >>. can select a subset of an arrow's result list using a function of type [c] -> [d], and >. can select a single item from that list using a function of type [c] -> d. So, after the children are selected and filtered down to the "a" tags, we can use (!! nth) :: [a] -> a.
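To see the two operators in isolation, here is a small sketch that uses HXT's pure list arrow LA instead of an XML document. The arrow named numbers is made up for illustration; constL, runLA, >>. and >. are re-exported by Text.XML.HXT.Core.

```haskell
module Main where

import Text.XML.HXT.Core

-- An arrow that ignores its input and yields five results.
numbers :: LA () Int
numbers = constL [10, 20, 30, 40, 50]

main :: IO ()
main = do
    -- (>>.) maps the whole result list to another list.
    print (runLA (numbers >>. take 2) ())              -- [10,20]
    -- (>.) collapses the whole result list into a single result.
    print (runLA (numbers >. length) ())               -- [5]
    -- Picking index 1 with (>.) and (!!), as in the answer.
    print (runLA (numbers >. (!! 1)) ())               -- [20]
    -- The same pick with (>>.), which tolerates short lists.
    print (runLA (numbers >>. (take 1 . drop 1)) ())   -- [20]
```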

There's an important thing to note:

infixr 1 >>>
infixl 5 />
infixl 8 >.

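(I had a hard time figuring out why >. did not work as expected without parentheses.) Because >. binds tighter than both /> and >>>, an unparenthesized >. (!! nth) would attach only to hasName "a", which yields at most one result per input node, rather than to the whole list of "a" elements. Thus getChildren /> isElem >>> hasName "a" must be wrapped in parentheses.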
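The effect of the precedence can be reproduced without any XML. The sketch below again uses the pure list arrow LA, with a made-up arrow named numbers, and compares the unparenthesized and parenthesized forms.

```haskell
module Main where

import Text.XML.HXT.Core

-- An arrow that ignores its input and yields three results.
numbers :: LA () Int
numbers = constL [1, 2, 3]

main :: IO ()
main = do
    -- Without parentheses this parses as numbers >>> (arr (+ 1) >. length),
    -- so length only sees the single result produced for each input.
    print (runLA (numbers >>> arr (+ 1) >. length) ())    -- [1,1,1]
    -- With parentheses, length sees all results of the pipeline.
    print (runLA ((numbers >>> arr (+ 1)) >. length) ())  -- [3]
```

With (!! nth) in place of length, the first form would index into a one-element list for each input, which is why the unparenthesized selector fails for any nth greater than 0.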
