Expensive Asynchronous Reading of Response Stream

Question

I have been trying to learn F# for the past couple of days and I keep running into something that perplexes me. My "learning project" is a screen scraper for some data I'm interested in manipulating.

In F# PowerPack there is a call Stream.AsyncReadToEnd. I did not want to use the PowerPack just for that single call so I took a look at how they did it.

module Downloader =
    open System
    open System.IO
    open System.Net
    open System.Collections

    type public BulkDownload(uriList : IEnumerable) =
        member this.UriList with get() = uriList

        member this.ParalellDownload() =
            let Download (uri : Uri) = async {
                let UnblockViaNewThread f = async {
                    do! Async.SwitchToNewThread()
                    let res = f()
                    do! Async.SwitchToThreadPool()
                    return res }

                let request = HttpWebRequest.Create(uri)
                let! response = request.AsyncGetResponse()
                use responseStream = response.GetResponseStream()
                use reader = new StreamReader(responseStream)
                let! contents = UnblockViaNewThread (fun() -> reader.ReadToEnd())
                return uri, contents.ToString().Length }

            this.UriList
            |> Seq.cast
            |> Seq.map Download
            |> Async.Parallel
            |> Async.RunSynchronously

They have that function UnblockViaNewThread. Is that really the only way to asynchronously read the response stream? Isn't creating a new thread really expensive (I've seen "~1 MB of memory" thrown around all over the place)? Is there a better way to do this? Is this what's really happening in every Async* call (any one that I can let!)?

EDIT: I followed Tomas' suggestions and came up with something independent of the F# PowerPack. Here it is. It still needs error handling, but it asynchronously requests each URL and downloads it to a byte array.

namespace Downloader
open System
open System.IO
open System.Net
open System.Collections

type public BulkDownload(uriList : IEnumerable) =
    member this.UriList with get() = uriList

    member this.ParalellDownload() =                
        let Download (uri : Uri) = async {
            let processStreamAsync (stream : Stream) = async { 
                let outputStream = new MemoryStream()
                let buffer = Array.zeroCreate<byte> 0x1000
                let completed = ref false
                while not (!completed) do
                    let! bytesRead = stream.AsyncRead(buffer, 0, 0x1000)
                    if bytesRead = 0 then
                        completed := true
                    else
                        outputStream.Write(buffer, 0, bytesRead)
                stream.Close()
                return outputStream.ToArray() }

            let request = HttpWebRequest.Create(uri)
            let! response = request.AsyncGetResponse()
            use responseStream = response.GetResponseStream()
            let! contents = processStreamAsync responseStream
            return uri, contents.Length }

        this.UriList
        |> Seq.cast
        |> Seq.map Download
        |> Async.Parallel
        |> Async.RunSynchronously

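    // Cast to Uri and convert to strings first; passing the non-generic IEnumerable
    // directly would be treated as a single object and print only its type name.
    override this.ToString() = String.Join(", ", this.UriList |> Seq.cast<Uri> |> Seq.map string)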
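
The read loop in the EDIT can also be written without the ref cell by using a recursive async function. The following is only a sketch: `readAllBytesAsync` is a hypothetical name, and it reads from an in-memory stream so that it runs without a network connection. It relies on the `AsyncRead` extension that FSharp.Core adds to `Stream`.

```fsharp
open System.IO

/// Reads a stream to the end in 4 KB chunks.
/// Hypothetical helper, not part of the F# PowerPack.
let readAllBytesAsync (stream : Stream) : Async<byte[]> = async {
    use output = new MemoryStream()
    let buffer = Array.zeroCreate<byte> 0x1000
    let rec loop () = async {
        // AsyncRead yields 0 once the end of the stream is reached
        let! bytesRead = stream.AsyncRead(buffer, 0, buffer.Length)
        if bytesRead > 0 then
            output.Write(buffer, 0, bytesRead)
            return! loop () }
    do! loop ()
    return output.ToArray() }

// Usage: copy 10000 bytes out of an in-memory stream
let sample = Array.init 10000 (fun i -> byte (i % 256))
let copy = readAllBytesAsync (new MemoryStream(sample)) |> Async.RunSynchronously
```

The caller keeps ownership of the input stream here, so the helper does not close it; `use responseStream` in the calling code already takes care of that.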

Solution

I think that an AsyncReadToEnd that just calls ReadToEnd synchronously on a separate thread is the wrong approach.
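
To make the difference concrete, here is a rough sketch of the two styles for a single read. The names `readViaNewThread` and `readViaCallback` are made up for illustration. The second one wraps the stream's `BeginRead`/`EndRead` pair with `Async.FromBeginEnd`, which is the general pattern behind non-blocking `Async*` operations: no thread is held while the read is pending, and the workflow resumes when the completion callback fires.

```fsharp
open System.IO

// Thread-based style: a dedicated thread is created and then
// sits blocked inside the synchronous Read call.
let readViaNewThread (stream : Stream) (buffer : byte[]) = async {
    do! Async.SwitchToNewThread()
    let n = stream.Read(buffer, 0, buffer.Length)   // blocks the new thread
    do! Async.SwitchToThreadPool()
    return n }

// Callback-based style: the Begin/End pair is wrapped as an Async<int>,
// so no thread waits while the read is in flight.
let readViaCallback (stream : Stream) (buffer : byte[]) =
    Async.FromBeginEnd(buffer, 0, buffer.Length, stream.BeginRead, stream.EndRead)

// Usage with an in-memory stream (assumed here only to keep the sketch runnable)
let data = [| 1uy; 2uy; 3uy |]
let n1 = readViaNewThread (new MemoryStream(data)) (Array.zeroCreate 16) |> Async.RunSynchronously
let n2 = readViaCallback (new MemoryStream(data)) (Array.zeroCreate 16) |> Async.RunSynchronously
```

Both return the same byte count; what differs is whether a thread is tied up while waiting, which matters when many downloads run in parallel.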

The F# PowerPack also contains a type, AsyncStreamReader, that provides a proper asynchronous implementation of stream reading. Its ReadLine method (asynchronously) returns the next line and downloads only a few chunks from the source stream at a time (using the asynchronous ReadAsync rather than running on a background thread).

let processStreamAsync stream = async { 
  use asyncReader = new AsyncStreamReader(stream)
  let completed = ref false
  while not (!completed) do 
    // Asynchronously get the next line
    let! nextLine = asyncReader.ReadLine()
    if nextLine = null then completed := true
    else
       (* process the next line *)
       () }

If you want to download the whole content as a string (instead of processing it line by line), you can use the ReadToEnd method of AsyncStreamReader. This is a proper asynchronous implementation: it starts downloading a block of data asynchronously and repeats this until the end of the stream, without blocking a thread.

async { 
  use asyncReader = new AsyncStreamReader(stream)
  return! asyncReader.ReadToEnd() }
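
If you would rather not take the PowerPack dependency at all, the same idea can be sketched by hand. This is not the PowerPack implementation: `readToEndAsync` is a hypothetical helper, and UTF-8 is an assumed encoding. It uses a stateful `Decoder` so that a multi-byte character split across two chunks is still decoded correctly.

```fsharp
open System.IO
open System.Text

/// Hypothetical helper: reads the stream chunk by chunk and decodes it
/// as UTF-8 text (the encoding is an assumption of this sketch).
let readToEndAsync (stream : Stream) : Async<string> = async {
    let decoder = Encoding.UTF8.GetDecoder()   // keeps partial characters between calls
    let bytes = Array.zeroCreate<byte> 0x1000
    let chars = Array.zeroCreate<char> (Encoding.UTF8.GetMaxCharCount bytes.Length)
    let text = StringBuilder()
    let finished = ref false
    while not finished.Value do
        let! bytesRead = stream.AsyncRead(bytes, 0, bytes.Length)
        if bytesRead = 0 then
            finished.Value <- true
        else
            let charCount = decoder.GetChars(bytes, 0, bytesRead, chars, 0)
            text.Append(chars, 0, charCount) |> ignore
    return text.ToString() }

// Usage: 5 bytes per repetition, so 4 KB chunk boundaries fall inside characters
let original = String.replicate 3000 "\u00e9\u20ac"
let encoded = Encoding.UTF8.GetBytes original
let roundTrip = readToEndAsync (new MemoryStream(encoded)) |> Async.RunSynchronously
```

A real response may use a different encoding (for example, one declared in the Content-Type header), so treat the hard-coded UTF-8 as a placeholder.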

Also, the F# PowerPack is open source and has a permissive license, so often the best way to use it is to copy just the few files you need into your project.
