How to read and echo file size of uploaded file being written at server in real time without blocking at both server and client?

    Question:

    How to read and echo file size of uploaded file being written at server in real time without blocking at both server and client?

    Context:

    Progress of file upload being written to server from POST request made by fetch(), where body is set to Blob, File, TypedArray, or ArrayBuffer object.

    The current implementation sets File object at body object passed to second parameter of fetch().

    Requirement:

    Read and echo to the client the file size of the file being written to the filesystem at the server as text/event-stream. Stop when all of the bytes, provided as a query string parameter to the script at the GET request, have been written. The read of the file currently takes place in a separate script environment, where the GET call to the script which should read the file is made following the POST to the script which writes the file to the server.

    Error handling for potential issues with the write of the file to the server, or the read of the file to get the current file size, has not been reached yet, though that would be the next step once the echo of the file size portion is completed.

    Presently attempting to meet the requirement using php, though also interested in c, bash, nodejs, python, or other languages or approaches which can be used to perform the same task.

    The client side javascript portion is not an issue. I am simply not versed enough in php, one of the most common server-side languages used on the world wide web, to implement the pattern without including unnecessary parts.

    Motivation:

    Progress indicators for fetch?

    Related:

    Fetch with ReadableStream

    Issues:

    Getting

    PHP Notice:  Undefined index: HTTP_LAST_EVENT_ID in stream.php on line 7
    

    at the terminal.

    Also, if substitute

    while(file_exists($_GET["filename"]) 
      && filesize($_GET["filename"]) < intval($_GET["filesize"]))
    

    for

    while(true)
    

    produces error at EventSource.

    Without the sleep() call, the correct file size for a 3.3MB file, 3321824, was dispatched to the message event, though it was printed at the console 61921, 26214, and 38093 times, respectively, when the same file was uploaded three times. The expected result is the file size of the file as it is being written at

    stream_copy_to_stream($input, $file);
    

    instead of the file size of the uploaded file object. Do fopen() or stream_copy_to_stream() block a different php process from reading the file at stream.php?

    Tried so far:

    The php attempt is attributed to:

    • Beyond $_POST, $_GET and $_FILE: Working with Blob in JavaScript and PHP
    • An introduction to server-sent events with a PHP example

    php

    // can we merge `data.php`, `stream.php` to same file?
    // can we use `STREAM_NOTIFY_PROGRESS` 
    // "Indicates current progress of the stream transfer 
    // in bytes_transferred and possibly bytes_max as well" to read bytes?
    // do we need to call `stream_set_blocking` to `false`
    // data.php
    <?php
    
      $filename = $_SERVER["HTTP_X_FILENAME"];
      $input = fopen("php://input", "rb");
      $file = fopen($filename, "wb"); 
      stream_copy_to_stream($input, $file);
      fclose($input);
      fclose($file);
      echo "upload of " . $filename . " successful";
    
    ?>
    

    // stream.php
    <?php
    
      header("Content-Type: text/event-stream");
      header("Cache-Control: no-cache");
      header("Connection: keep-alive");
      // `PHP Notice:  Undefined index: HTTP_LAST_EVENT_ID in stream.php on line 7` ?
      $lastId = $_SERVER["HTTP_LAST_EVENT_ID"] || 0;
      if (isset($lastId) && !empty($lastId) && is_numeric($lastId)) {
          $lastId = intval($lastId);
          $lastId++;
      }
      // else {
      //  $lastId = 0;
      // }
    
      // while current file size read is less than or equal to 
      // `$_GET["filesize"]` of `$_GET["filename"]`
      // how to loop only when above is `true`
      while (true) {
        $upload = $_GET["filename"];
        // is this the correct function and variable to use
        // to get written bytes of `stream_copy_to_stream($input, $file);`?
        $data = filesize($upload);
        // $data = $_GET["filename"] . " " . $_GET["filesize"];
        if ($data) {
          sendMessage($lastId, $data);
          $lastId++;
        } 
        // else {
        //   close stream 
        // }
    // not necessary here, though without it thousands of `message` events
    // will be dispatched
        // sleep(1);
        }
    
        function sendMessage($id, $data) {
          echo "id: $id\n";
          echo "data: $data\n\n";
          ob_flush();
          flush();
        }
    ?>
    

    javascript

    <!DOCTYPE html>
    <html>
    <head>
    </head>
    <body>
    <input type="file">
    <progress value="0" max="0"></progress>
    <script>
    
    const [url, stream, header] = ["data.php", "stream.php", "x-filename"];
    
    const [input, progress, handleFile] = [
            document.querySelector("input[type=file]")
          , document.querySelector("progress")
          , (event) => {
              const [file] = input.files;
              const [{size:filesize, name:filename}, headers, params] = [
                      file, new Headers(), new URLSearchParams()
                    ];
              // set `filename`, `filesize` as search parameters for `stream` URL
              Object.entries({filename, filesize})
              .forEach(([...props]) => params.append.apply(params, props));
              // set header for `POST`
              headers.append(header, filename);
              // reset `progress.value` set `progress.max` to `filesize`
              [progress.value, progress.max] = [0, filesize];
              const [request, source] = [
                new Request(url, {
                      method:"POST", headers:headers, body:file
                    })
                // https://stackoverflow.com/a/42330433/
              , new EventSource(`${stream}?${params.toString()}`)
              ];
              source.addEventListener("message", (e) => {
                // update `progress` here,
                // call `.close()` when `e.data === filesize` 
                // `progress.value = e.data`, should be this simple
                console.log(e.data, e.lastEventId);
              }, true);
    
              source.addEventListener("open", (e) => {
                console.log("fetch upload progress open");
              }, true);
    
              source.addEventListener("error", (e) => {
                console.error("fetch upload progress error");
              }, true);
              // sanity check for tests, 
              // we don't need `source` when `e.data === filesize`;
              // we could call `.close()` within `message` event handler
              setTimeout(() => source.close(), 30000);
              // we don't need `source' to be in `Promise` chain, 
              // though we could resolve if `e.data === filesize`
              // before `response`, then wait for `.text()`; etc.
              // TODO: if and where to merge or branch `EventSource`,
              // `fetch` to single or two `Promise` chains
              const upload = fetch(request);
              upload
              .then(response => response.text())
              .then(res => console.log(res))
              .catch(err => console.error(err));
            }
    ];
    
    input.addEventListener("change", handleFile, true);
    </script>
    </body>
    </html>
    

    Solution:

    You need to clearstatcache to get real file size. With few other bits fixed, your stream.php may look like following:

    <?php
    
    header("Content-Type: text/event-stream");
    header("Cache-Control: no-cache");
    header("Connection: keep-alive");
    // Check if the header's been sent to avoid `PHP Notice:  Undefined index: HTTP_LAST_EVENT_ID in stream.php on line `
    // php 7+
    //$lastId = $_SERVER["HTTP_LAST_EVENT_ID"] ?? 0;
    // php < 7
    $lastId = isset($_SERVER["HTTP_LAST_EVENT_ID"]) ? intval($_SERVER["HTTP_LAST_EVENT_ID"]) : 0;
    
    $upload = $_GET["filename"];
    $data = 0;
    // if file already exists, its initial size can be bigger than the new one, so we need to ignore it
    $wasLess = $lastId != 0;
    while ($data < $_GET["filesize"] || !$wasLess) {
        // system calls are expensive and are being cached with assumption that in most cases file stats do not change often
        // so we clear cache to get most up to date data
        clearstatcache(true, $upload);
        $data = filesize($upload);
        $wasLess |= $data <  $_GET["filesize"];
        // don't send stale filesize
        if ($wasLess) {
            sendMessage($lastId, $data);
            $lastId++;
        }
        // not necessary here, though without it thousands of `message` events will be dispatched
        //sleep(1);
        // millions on a poor connection with large files; 1 second might be too much, but 50 messages a second must be okay
        usleep(20000);
    }
    
    function sendMessage($id, $data)
    {
        echo "id: $id\n";
        echo "data: $data\n\n";
        ob_flush();
        // no need to flush(). It adds content length of the chunk to the stream
        // flush();
    }
    

    Few caveats:

    Security. I mean the lack of it. As I understand it, this is a proof of concept, and security is the least of concerns, yet the disclaimer should be there. This approach is fundamentally flawed, and should only be used if you don't care about DOS attacks or about information on your files leaking out.

    CPU. Without usleep the script will consume 100% of a single core. With a long sleep you are at risk of uploading the whole file within a single iteration, and the exit condition will never be met. If you are testing locally, the usleep should be removed completely, since uploading MBs locally is a matter of milliseconds.

    Open connections. Both apache and nginx/fpm have a finite number of php processes that can serve requests. A single file upload will occupy two of them for the time required to upload the file. With slow bandwidth or forged requests, this time can be quite long, and the web server may start to reject requests.

    Clientside part. You need to analyse the response and finally stop listening to the events when the file is fully uploaded.
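A minimal sketch of that client-side handling, assuming the server sends the byte count as the event data (the helper names are illustrative; `source` and `progress` stand for the `EventSource` and `<progress>` element from the page code):

```javascript
// Decide whether the upload is complete from an SSE message.
// SSE data arrives as a string, so compare numerically.
function uploadComplete(data, filesize) {
  return Number(data) >= filesize;
}

// Build a `message` handler that updates the progress bar and
// closes the EventSource once the reported size reaches the total.
function makeMessageHandler(source, progress, filesize) {
  return (e) => {
    progress.value = Number(e.data);
    if (uploadComplete(e.data, filesize)) {
      source.close(); // stop listening; the fetch() promise chain continues
    }
  };
}
```

Wiring it up would replace the `console.log` body of the existing `message` listener: `source.addEventListener("message", makeMessageHandler(source, progress, filesize), true);`.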

    EDIT:

    To make it more or less production friendly, you will need in-memory storage like redis or memcache to store file metadata.

    When making the POST request, add a unique token which identifies the file, and the file size.

    In your javascript:

    const fileId = Math.random().toString(36).substr(2); // or anything more unique
    ...
    
    const [request, source] = [
        new Request(`${url}?fileId=${fileId}&size=${filesize}`, {
            method:"POST", headers:headers, body:file
        })
        , new EventSource(`${stream}?fileId=${fileId}`)
    ];
    ....
    

    In data.php register the token and report progress by chunks:

    ....
    
    $fileId = $_GET['fileId'];
    $fileSize = $_GET['size'];
    
    setUnique($fileId, 0, $fileSize);
    
    $total = 0;
    while ($uploaded = stream_copy_to_stream($input, $file, 1024)) {
        // stream_copy_to_stream() returns the bytes copied by this call,
        // so accumulate before reporting
        $total += $uploaded;
        updateProgress($fileId, $total);
    }
    ....
    
    
    /**
     * Check if Id is unique, and store processed as 0, and full_size as $size 
     * Set reasonable TTL for the key, e.g. 1hr 
     *
     * @param string $id
     * @param int $processed
     * @param int $size
     * @throws Exception if id is not unique
     */
    function setUnique($id, $processed, $size) {
        // implement with your storage of choice
    }
    
    /**
     * Updates uploaded size for the given file
     *
     * @param string $id
     * @param int $processed
     */
    function updateProgress($id, $processed) {
        // implement with your storage of choice
    }
    

    So your stream.php doesn't need to hit the disk at all, and can sleep for as long as is acceptable for the UX:

    ....
    list($progress, $size) = getProgress('non_existing_key_to_init_default_values');
    $lastId = 0;
    
    while ($progress < $size) {
        list($progress, $size) = getProgress($_GET["fileId"]);
        sendMessage($lastId, $progress);
        $lastId++;
        sleep(1);
    }
    .....
    
    
    /**
     * Get progress of the file upload.
     * If id is not there yet, returns [0, PHP_INT_MAX]
     *
     * @param $id
     * @return array $bytesUploaded, $fileSize
     */
    function getProgress($id) {
        // implement with your storage of choice
    }
    
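The storage contract these stubs describe can be illustrated with a plain in-memory map (sketched in JavaScript rather than PHP, since the storage backend is left open; a real deployment would use redis/memcache with a TTL):

```javascript
// In-memory illustration of the metadata contract above
const store = new Map();

// Reject duplicate ids; store processed bytes and full size
function setUnique(id, processed, size) {
  if (store.has(id)) throw new Error(`id ${id} is not unique`);
  store.set(id, { processed, size }); // a real store would also set a TTL
}

// Update the uploaded byte count for a known id
function updateProgress(id, processed) {
  const meta = store.get(id);
  if (meta) meta.processed = processed;
}

// Unknown id: [0, a huge size] keeps the reader loop waiting
function getProgress(id) {
  const meta = store.get(id);
  return meta ? [meta.processed, meta.size] : [0, Number.MAX_SAFE_INTEGER];
}
```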

    The problem with 2 open connections cannot be solved unless you give up EventSource for good old polling. The response time of stream.php without the loop is a matter of milliseconds, and it is quite wasteful to keep the connection open all the time, unless you need hundreds of updates a second.
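That pull-based alternative can be sketched like this (the `progress.php` endpoint and its plain-text byte-count response are assumptions; the fetch function is injectable so the loop can be exercised without a server):

```javascript
// Poll a progress endpoint instead of holding an SSE connection open.
// Each poll is a short request, so no php process stays occupied.
async function pollProgress(fileId, filesize, onUpdate, intervalMs = 1000, fetchFn = fetch) {
  let uploaded = 0;
  while (uploaded < filesize) {
    const res = await fetchFn(`progress.php?fileId=${fileId}`);
    uploaded = Number(await res.text());
    onUpdate(uploaded);
    if (uploaded < filesize) {
      // wait between polls; tune the interval to UX needs
      await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
  }
}
```

Usage would mirror the EventSource version: `pollProgress(fileId, filesize, v => progress.value = v);` started right after the upload `fetch(request)`.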
