How do I make a do block return early?


Problem description

I'm trying to scrape a webpage using Haskell and compile the results into an object.

If, for whatever reason, I can't get all the items from the pages, I want to stop trying to process the page and return early.

For example:

scrapePage :: String -> IO ()
scrapePage url = do
  doc <- fromUrl url
  title <- liftM headMay $ runX $ doc >>> css "head.title" >>> getText
  when (isNothing title) (return ())
  date <- liftM headMay $ runX $ doc >>> css "span.dateTime" ! "data-utc"
  when (isNothing date) (return ())
  -- etc
  -- make page object and send it to db
  return ()

The problem is that when doesn't stop the do block or keep the remaining steps from being executed.

What is the right way to do this?

Solution

return in Haskell does not do the same thing as return in other languages. Instead, return injects a value into a monad (in this case IO); it does not end the enclosing do block. You have a couple of options.
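To see that return only produces a value and does not leave the block, here is a minimal sketch. The name demo and the IORef-based log are illustrative assumptions, not part of the scraper in the question.

```haskell
import Control.Monad (when)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Records which steps of the do block were reached.
demo :: IO [String]
demo = do
  logRef <- newIORef []
  let step s = modifyIORef logRef (++ [s])
  step "before"
  -- `return ()` just yields () to `when`; the block carries on afterwards.
  when True (return ())
  step "after"
  readIORef logRef
```

Running demo yields both "before" and "after", which is the behaviour the question ran into.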

The simplest is to use if:

scrapePage :: String -> IO ()
scrapePage url = do
  doc <- fromUrl url
  title <- liftM headMay $ runX $ doc >>> css "head.title" >>> getText
  if (isNothing title) then return () else do
   date <- liftM headMay $ runX $ doc >>> css "span.dateTime" ! "data-utc"
   if (isNothing date) then return () else do
     -- etc
     -- make page object and send it to db
     return ()
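A closely related variant replaces if with case, so the value inside the Maybe is bound in the branch that continues. The sketch below is self-contained and swaps the real scraping calls for a hypothetical association list (Page) and lookup; those names are assumptions made for illustration only.

```haskell
-- Hypothetical stand-in for a scraped page: field name -> text.
type Page = [(String, String)]

-- Returns Nothing as soon as a required field is missing.
processPage :: Page -> IO (Maybe (String, String))
processPage page =
  case lookup "title" page of
    Nothing -> return Nothing            -- stop: no title
    Just title ->
      case lookup "date" page of
        Nothing -> return Nothing        -- stop: no date
        Just date -> do
          -- This is where the page object would be built and stored.
          putStrLn ("saving " ++ title)
          return (Just (title, date))
```

The nesting grows with every required field, which is the main drawback of this style.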

Another option is to use unless:

scrapePage url = do
  doc <- fromUrl url
  title <- liftM headMay $ runX $ doc >>> css "head.title" >>> getText
  unless (isNothing title) $ do
    date <- liftM headMay $ runX $ doc >>> css "span.dateTime" ! "data-utc"
    unless (isNothing date) $ do
      -- etc
      -- make page object and send it to db
      return ()
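As a rough, runnable illustration of the unless approach, the sketch below records which steps are reached for given inputs. The function stepsReached and its log are hypothetical; the two Maybe arguments stand in for the scraped title and date.

```haskell
import Control.Monad (unless)
import Data.IORef (modifyIORef, newIORef, readIORef)
import Data.Maybe (isNothing)

-- Logs each step that runs, skipping the rest once a value is missing.
stepsReached :: Maybe String -> Maybe String -> IO [String]
stepsReached title date = do
  logRef <- newIORef []
  let step s = modifyIORef logRef (++ [s])
  step "title fetched"
  unless (isNothing title) $ do
    step "date fetched"
    unless (isNothing date) $ do
      step "saved"
  readIORef logRef
```

Note the `$` before each nested do; without the BlockArguments extension, writing `unless cond do` does not parse.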

The general problem here is that the IO monad doesn't have control effects (except for exceptions). On the other hand, you could use the Maybe monad transformer, MaybeT:

scrapePage url = liftM (maybe () id) . runMaybeT $ do
  doc <- liftIO $ fromUrl url
  title <- liftIO $ liftM headMay $ runX $ doc >>> css "head.title" >>> getText
  guard (isJust title)
  date <- liftIO $ liftM headMay $ runX $ doc >>> css "span.dateTime" ! "data-utc"
  guard (isJust date)
  -- etc
  -- make page object and send it to db
  return ()
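If each scraping step already returns an IO (Maybe a), wrapping it in the MaybeT constructor short-circuits on Nothing and binds the unwrapped value, so no guard or isJust is needed. This is a sketch under the assumption that the transformers package is available; Page and fetchField are hypothetical stand-ins for the real scraping calls.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Maybe (MaybeT (..))

-- Hypothetical stand-in for a scraped page: field name -> text.
type Page = [(String, String)]

-- Hypothetical fetch step; a real one would run the scraper.
fetchField :: Page -> String -> IO (Maybe String)
fetchField page key = return (lookup key page)

-- The block stops at the first step that yields Nothing.
scrape :: Page -> IO (Maybe (String, String))
scrape page = runMaybeT $ do
  title <- MaybeT (fetchField page "title")
  date  <- MaybeT (fetchField page "date")
  liftIO (putStrLn ("saving " ++ title))
  return (title, date)
```

The caller gets Nothing when any field was missing and Just the collected values otherwise.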

If you really want full-blown control effects, you need to use ContT:

scrapePage :: String -> IO ()
scrapePage url = flip runContT return $ callCC $ \exit -> do
  -- exit is the continuation captured by callCC; calling it skips the rest of the block
  doc <- liftIO $ fromUrl url
  title <- liftIO $ liftM headMay $ runX $ doc >>> css "head.title" >>> getText
  when (isNothing title) $ exit ()
  date <- liftIO $ liftM headMay $ runX $ doc >>> css "span.dateTime" ! "data-utc"
  when (isNothing date) $ exit ()
  -- etc
  -- make page object and send it to db
  return ()
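For early exit with ContT, one common pattern is to wrap the whole block in callCC and call the captured continuation to leave it. The sketch below assumes the transformers package (Control.Monad.Trans.Cont); stepsReached and the log are hypothetical and only show which steps run.

```haskell
import Control.Monad (when)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Cont (callCC, evalContT)
import Data.IORef (modifyIORef, newIORef, readIORef)
import Data.Maybe (isNothing)

-- Logs each step that runs; `exit ()` jumps out of the callCC block.
stepsReached :: Maybe String -> Maybe String -> IO [String]
stepsReached title date = do
  logRef <- newIORef []
  let step s = liftIO (modifyIORef logRef (++ [s]))
  evalContT $ callCC $ \exit -> do
    step "title fetched"
    when (isNothing title) $ exit ()
    step "date fetched"
    when (isNothing date) $ exit ()
    step "saved"
  readIORef logRef
```

By contrast, `callCC ($ ())` invokes the continuation at the point where it was captured, so it behaves like `return ()` and does not skip anything.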

WARNING: none of the above code has been tested, or even type checked!
