Haskell HXT for extracting a list of values

Problem description

I'm trying to figure my way through HXT with XPath and arrows at the same time and I'm completely stuck on how to think through this problem. I've got the following HTML:

<div>
<div class="c1">a</div> 
<div class="c2">b</div> 
<div class="c3">123</div> 
<div class="c4">234</div> 
</div>

which I've extracted into an HXT XmlTree. What I'd like to do is define a function (I think?):

getValues :: [String] -> IOSArrow XmlTree [(String, String)]

Which, if used as getValues ["c1", "c2", "c3", "c4"], will get me:

[("c1", "a"), ("c2", "b"), ("c3", "123"), ("c4", "234")]

Help please?

Solution

Here's one approach (my types are a bit more general and I'm not using XPath):

{-# LANGUAGE Arrows #-}
module Main where

import qualified Data.Map as M
import Text.XML.HXT.Core  -- hxt >= 9; older releases exposed this as Text.XML.HXT.Arrow

classes :: (ArrowXml a) => a XmlTree (M.Map String String)
classes = listA (divs >>> divs >>> pairs) >>> arr M.fromList
  where
    divs = getChildren >>> hasName "div"
    pairs = proc div -> do
      cls <- getAttrValue "class" -< div
      val <- deep getText         -< div
      returnA -< (cls, val)

getValues :: (ArrowXml a) => [String] -> a XmlTree [(String, Maybe String)]
getValues cs = classes >>> arr (zip cs . lookupValues cs)
  where lookupValues cs m = map (flip M.lookup m) cs

main = do
  let xml = "<div><div class='c1'>a</div><div class='c2'>b</div>\
            \<div class='c3'>123</div><div class='c4'>234</div></div>"

  print =<< runX (readString [] xml >>> getValues ["c1", "c2", "c3", "c4"])

I would probably run an arrow to get the map and then do the lookups, but this way works as well.
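
The alternative mentioned here, running the arrow only to build the map and doing the lookups in ordinary code afterwards, might look roughly like the sketch below. It is not part of the original answer: `lookupAll` is a hypothetical helper name, and the map is built by hand so the example runs without HXT installed. In a real program the map would come from something like `runX (readString [] xml >>> classes)`, which returns a list of results in `IO`.

```haskell
module Main where

import qualified Data.Map as M

-- Hypothetical helper: pair each requested class name with its value,
-- or with Nothing when the class is absent from the map.
lookupAll :: [String] -> M.Map String String -> [(String, Maybe String)]
lookupAll cs m = [ (c, M.lookup c m) | c <- cs ]

main :: IO ()
main = do
  -- Hand-built stand-in for the map the `classes` arrow would produce.
  let m = M.fromList [("c1", "a"), ("c2", "b"), ("c3", "123"), ("c4", "234")]
  -- "c5" is not in the map, so it comes back paired with Nothing.
  print (lookupAll ["c1", "c2", "c5"] m)
```

Keeping the lookup pure means it can be tested and reused without touching any XML.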


To answer your question about listA: divs >>> divs >>> pairs is a list arrow with type a XmlTree (String, String), i.e., it's a non-deterministic computation that takes an XML tree and may return any number of string pairs.

arr M.fromList has type a [(String, String)] (M.Map String String). This means we can't just compose it with divs >>> divs >>> pairs, since the types don't match up.

listA solves this problem: it collapses divs >>> divs >>> pairs into a deterministic version with type a XmlTree [(String, String)], which is exactly what we need.
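
To make the type argument above concrete, here is a toy model of a list arrow as a plain function returning a list. This is only a sketch: `ListArr`, `andThen`, `pureArr`, `collect`, and `pairsOf` are hypothetical names invented for illustration, not HXT's API, and the "document" is simply a list of pairs. `collect` plays the role of listA by turning many results into a single list-valued result.

```haskell
module Main where

import qualified Data.Map as M

-- Toy stand-in for a list arrow: a function that may yield
-- zero, one, or many results for a single input.
newtype ListArr a b = ListArr { runListArr :: a -> [b] }

-- Sequential composition: feed every result of the first arrow into the second.
andThen :: ListArr a b -> ListArr b c -> ListArr a c
andThen (ListArr f) (ListArr g) = ListArr (concatMap g . f)

-- Lift a pure function; it always yields exactly one result.
pureArr :: (a -> b) -> ListArr a b
pureArr f = ListArr (\x -> [f x])

-- Toy analogue of listA: gather all results into one list-valued result.
collect :: ListArr a b -> ListArr a [b]
collect (ListArr f) = ListArr (\x -> [f x])

-- Stand-in for `divs >>> divs >>> pairs`: yields each pair as a separate result.
pairsOf :: ListArr [(String, String)] (String, String)
pairsOf = ListArr id

-- `pairsOf `andThen` pureArr M.fromList` would not type-check, because
-- M.fromList expects a whole list; `collect` bridges the gap.
toMap :: ListArr [(String, String)] (M.Map String String)
toMap = collect pairsOf `andThen` pureArr M.fromList

main :: IO ()
main = print (runListArr toMap [("c1", "a"), ("c2", "b")])
```

Running `toMap` yields a one-element list containing the map, which mirrors why runX in the answer's main prints a list with a single result.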
