How to read and echo the file size of an uploaded file being written at the server in real time, without blocking at either the server or the client?


Problem description

Question:

How can the file size of an uploaded file be read and echoed in real time while the file is being written at the server, without blocking at either the server or the client?

Context:

Progress of a file upload being written to the server from a POST request made by fetch(), where body is set to a Blob, File, TypedArray, or ArrayBuffer object.

The current implementation sets a File object as the body property of the object passed as the second parameter to fetch().

Requirement:

Read the size of the file being written to the filesystem at the server and echo it to the client as text/event-stream. Stop when all of the bytes have been written; the expected total is provided to the script as a query string parameter of the GET request. The read of the file currently takes place in a separate script environment: the GET call to the script which reads the file is made following the POST to the script which writes the file to the server.

Error handling for potential issues with writing the file to the server, or with reading the file to get the current file size, has not been reached yet, though that would be the next step once the echo of the file size is working.

Presently attempting to meet the requirement using php, though also interested in c, bash, nodejs, python, or other languages or approaches which can be used to perform the same task.

The client-side javascript portion is not an issue. I am simply not versed enough in php, one of the most common server-side languages used on the world wide web, to implement the pattern without including parts which are not necessary.

Motivation:

Progress indicators for fetch?

Related:

Fetch with ReadableStream

Issues:

Getting

PHP Notice:  Undefined index: HTTP_LAST_EVENT_ID in stream.php on line 7

at the terminal.

Also, substituting

while(file_exists($_GET["filename"]) 
  && filesize($_GET["filename"]) < intval($_GET["filesize"]))

for

while(true)

produces an error at EventSource.

Without the sleep() call, the correct file size for a 3.3MB file, 3321824, was dispatched to the message event and printed at the console 61921, 26214, and 38093 times, respectively, when the same file was uploaded three times. The expected result is the size of the file as it is being written at

stream_copy_to_stream($input, $file);

instead of the file size of the uploaded file object. Are fopen() or stream_copy_to_stream() blocking with respect to the separate php process running stream.php?

Tried so far:

php

// can we merge `data.php`, `stream.php` to same file?
// can we use `STREAM_NOTIFY_PROGRESS` 
// "Indicates current progress of the stream transfer 
// in bytes_transferred and possibly bytes_max as well" to read bytes?
// do we need to call `stream_set_blocking` to `false`
// data.php
<?php

  $filename = $_SERVER["HTTP_X_FILENAME"];
  $input = fopen("php://input", "rb");
  $file = fopen($filename, "wb"); 
  stream_copy_to_stream($input, $file);
  fclose($input);
  fclose($file);
  echo "upload of " . $filename . " successful";

?>

// stream.php
<?php

  header("Content-Type: text/event-stream");
  header("Cache-Control: no-cache");
  header("Connection: keep-alive");
  // `PHP Notice:  Undefined index: HTTP_LAST_EVENT_ID in stream.php on line 7` ?
  $lastId = $_SERVER["HTTP_LAST_EVENT_ID"] || 0;
  if (isset($lastId) && !empty($lastId) && is_numeric($lastId)) {
      $lastId = intval($lastId);
      $lastId++;
  }
  // else {
  //  $lastId = 0;
  // }

  // while current file size read is less than or equal to 
  // `$_GET["filesize"]` of `$_GET["filename"]`
  // how to loop only when above is `true`
  while (true) {
    $upload = $_GET["filename"];
    // is this the correct function and variable to use
    // to get written bytes of `stream_copy_to_stream($input, $file);`?
    $data = filesize($upload);
    // $data = $_GET["filename"] . " " . $_GET["filesize"];
    if ($data) {
      sendMessage($lastId, $data);
      $lastId++;
    } 
    // else {
    //   close stream 
    // }
    // not necessary here, though without thousands of `message` events
    // will be dispatched
    // sleep(1);
    }

    function sendMessage($id, $data) {
      echo "id: $id\n";
      echo "data: $data\n\n";
      ob_flush();
      flush();
    }
?>

javascript

<!DOCTYPE html>
<html>
<head>
</head>
<body>
<input type="file">
<progress value="0" max="0" step="1"></progress>
<script>

const [url, stream, header] = ["data.php", "stream.php", "x-filename"];

const [input, progress, handleFile] = [
        document.querySelector("input[type=file]")
      , document.querySelector("progress")
      , (event) => {
          const [file] = input.files;
          const [{size:filesize, name:filename}, headers, params] = [
                  file, new Headers(), new URLSearchParams()
                ];
          // set `filename`, `filesize` as search parameters for `stream` URL
          Object.entries({filename, filesize})
          .forEach(([...props]) => params.append.apply(params, props));
          // set header for `POST`
          headers.append(header, filename);
          // reset `progress.value` set `progress.max` to `filesize`
          [progress.value, progress.max] = [0, filesize];
          const [request, source] = [
            new Request(url, {
                  method:"POST", headers:headers, body:file
                })
            // https://stackoverflow.com/a/42330433/
          , new EventSource(`${stream}?${params.toString()}`)
          ];
          source.addEventListener("message", (e) => {
            // update `progress` here,
            // call `.close()` when `e.data === filesize` 
            // `progress.value = e.data`, should be this simple
            console.log(e.data, e.lastEventId);
          }, true);

          source.addEventListener("open", (e) => {
            console.log("fetch upload progress open");
          }, true);

          source.addEventListener("error", (e) => {
            console.error("fetch upload progress error");
          }, true);
          // sanity check for tests, 
          // we don't need `source` when `e.data === filesize`;
          // we could call `.close()` within `message` event handler
          setTimeout(() => source.close(), 30000);
          // we don't need `source' to be in `Promise` chain, 
          // though we could resolve if `e.data === filesize`
          // before `response`, then wait for `.text()`; etc.
          // TODO: if and where to merge or branch `EventSource`,
          // `fetch` to single or two `Promise` chains
          const upload = fetch(request);
          upload
          .then(response => response.text())
          .then(res => console.log(res))
          .catch(err => console.error(err));
        }
];

input.addEventListener("change", handleFile, true);
</script>
</body>
</html>

Recommended answer

You need to call clearstatcache() to get the real file size. With a few other bits fixed, your stream.php may look like the following:

<?php

header("Content-Type: text/event-stream");
header("Cache-Control: no-cache");
header("Connection: keep-alive");
// Check if the header's been sent to avoid `PHP Notice:  Undefined index: HTTP_LAST_EVENT_ID in stream.php on line `
// php 7+
//$lastId = $_SERVER["HTTP_LAST_EVENT_ID"] ?? 0;
// php < 7
$lastId = isset($_SERVER["HTTP_LAST_EVENT_ID"]) ? intval($_SERVER["HTTP_LAST_EVENT_ID"]) : 0;

$upload = $_GET["filename"];
$data = 0;
// if file already exists, its initial size can be bigger than the new one, so we need to ignore it
$wasLess = $lastId != 0;
while ($data < $_GET["filesize"] || !$wasLess) {
    // system calls are expensive and are being cached with assumption that in most cases file stats do not change often
    // so we clear cache to get most up to date data
    clearstatcache(true, $upload);
    $data = filesize($upload);
    $wasLess |= $data <  $_GET["filesize"];
    // don't send stale filesize
    if ($wasLess) {
        sendMessage($lastId, $data);
        $lastId++;
    }
    // not necessary here, though without thousands of `message` events will be dispatched
    //sleep(1);
    // millions on poor connection and large files. 1 second might be too much, but 50 messages a second must be okay
    usleep(20000);
}

function sendMessage($id, $data)
{
    echo "id: $id\n";
    echo "data: $data\n\n";
    ob_flush();
    // no need to flush(). It adds content length of the chunk to the stream
    // flush();
}
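
The exit condition of this loop is easy to misread, so here is a small simulation of it in JavaScript. It is only a sketch: simulateStream and its parameters are hypothetical names, and the list of observed sizes stands in for successive filesize() results after clearstatcache(). It shows which sizes would be sent and whether the loop would ever terminate.

```javascript
// Simulates the exit logic of the stream.php loop above on a list of
// observed file sizes (hypothetical helper, for illustration only).
function simulateStream(observedSizes, targetSize, lastId = 0) {
  const sent = [];
  let data = 0;
  // Mirrors `$wasLess = $lastId != 0`.
  let wasLess = lastId !== 0;
  let i = 0;
  while ((data < targetSize || !wasLess) && i < observedSizes.length) {
    data = observedSizes[i++];
    // Mirrors `$wasLess |= $data < $_GET["filesize"]`.
    wasLess = wasLess || data < targetSize;
    if (wasLess) {
      sent.push(data); // only sizes seen after the file was truncated
    }
  }
  // `finished` is false when the samples ran out before the exit
  // condition was met, i.e. the PHP loop would still be running.
  const finished = !(data < targetSize || !wasLess);
  return { sent, finished };
}

// A stale file of the full size is ignored until the new upload
// truncates it.
console.log(simulateStream([100, 0, 40, 100], 100));
// An upload that completes between two samples never satisfies the
// exit condition.
console.log(simulateStream([100, 100, 100], 100));
```

In the second call nothing is sent and the loop is never considered finished, which is the situation where the whole file arrives within a single sleep interval.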

Caveats:

Security. I mean the lack of it. As I understand it, this is a proof of concept and security is the least of the concerns, yet the disclaimer should be there. This approach is fundamentally flawed and should be used only if you do not care about DoS attacks or about information on your files leaking out.

CPU. Without usleep the script will consume 100% of a single core. With a long sleep you risk the whole file being uploaded within a single iteration, in which case the exit condition will never be met. If you are testing locally, the usleep should be removed completely, since uploading megabytes locally is a matter of milliseconds.

Open connections. Both apache and nginx/fpm have a finite number of php processes that can serve requests. A single file upload will occupy 2 of them for the time required to upload the file. With slow bandwidth or forged requests this time can be quite long, and the web server may start to reject requests.

Client-side part. You need to analyse the response and stop listening to the events once the file is fully uploaded.
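
As a rough illustration of the client-side point, the message listener could parse each event and close the EventSource once the reported size reaches the expected one. This is a minimal sketch: createProgressHandler is a hypothetical helper, and it assumes stream.php sends the byte count as the plain text of each event. Note that e.data is always a string, so it has to be converted before it is compared with the numeric filesize.

```javascript
// Hypothetical helper: builds a "message" listener for the EventSource
// used in the question. `source` only needs a close() method and
// `progress` only needs a writable `value`, so it can also be
// exercised outside a browser.
function createProgressHandler(source, progress, filesize) {
  return function onMessage(event) {
    // Server-sent event data arrives as a string; convert it first.
    const written = Number.parseInt(event.data, 10);
    if (Number.isNaN(written)) {
      return; // ignore malformed payloads
    }
    progress.value = Math.min(written, filesize);
    if (written >= filesize) {
      // All bytes have been reported: stop listening so the browser
      // does not reconnect to the stream.
      source.close();
    }
  };
}
```

In the page from the question it would be attached with source.addEventListener("message", createProgressHandler(source, progress, filesize)).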

To make it more or less production friendly, you will need an in-memory storage such as redis or memcache to store the file metadata.

When making the post request, add a unique token which identifies the file, together with the file size.

In your javascript:

const fileId = Math.random().toString(36).substr(2); // or anything more unique
...

const [request, source] = [
    new Request(`${url}?fileId=${fileId}&size=${filesize}`, {
        method:"POST", headers:headers, body:file
    })
    , new EventSource(`${stream}?fileId=${fileId}`)
];
....

In data.php, register the token and report progress by chunks:

....

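$fileId = $_GET['fileId'];
$fileSize = $_GET['size'];

// registers the token with processed = 0 and full_size = $fileSize
setUnique($fileId, $fileSize);

// $uploaded is the number of bytes copied in this chunk, so
// updateProgress() is expected to add it to the stored total
while ($uploaded = stream_copy_to_stream($input, $file, 1024)) {
    updateProgress($fileId, $uploaded);
}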
....


/**
 * Check if Id is unique, and store processed as 0, and full_size as $size 
 * Set reasonable TTL for the key, e.g. 1hr 
 *
 * @param string $id
 * @param int $size
 * @throws Exception if id is not unique
 */
function setUnique($id, $size) {
    // implement with your storage of choice
}

/**
 * Updates uploaded size for the given file
 *
 * @param string $id
 * @param int $processed
 */
function updateProgress($id, $processed) {
    // implement with your storage of choice
}

So your stream.php does not need to hit the disk at all, and can sleep for as long as is acceptable for the UX:

....
list($progress, $size) = getProgress('non_existing_key_to_init_default_values');
$lastId = 0;

while ($progress < $size) {
    list($progress, $size) = getProgress($_GET["fileId"]);
    sendMessage($lastId, $progress);
    $lastId++;
    sleep(1);
}
.....


/**
 * Get progress of the file upload.
 * If id is not there yet, returns [0, PHP_INT_MAX]
 *
 * @param $id
 * @return array $bytesUploaded, $fileSize
 */
function getProgress($id) {
    // implement with your storage of choice
}
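
The three stubs leave the storage open. As one way to picture the contract, here is a minimal sketch in JavaScript that keeps the state in a Map inside a single process, as it might be in a Node.js server (which the question lists as an alternative). The Map is only a stand-in for redis or memcache, the TTL is omitted, and none of this is the answer's own implementation. With PHP, where data.php and stream.php run in separate processes, a shared store is still required.

```javascript
// Hypothetical in-memory stand-in for redis/memcache. The function
// names mirror the PHP stubs; the bodies are an illustration only.
const progressStore = new Map();

// Registers a new upload; throws if the id is already in use.
function setUnique(id, size) {
  if (progressStore.has(id)) {
    throw new Error(`upload id ${id} is already registered`);
  }
  progressStore.set(id, { processed: 0, size: Number(size) });
}

// Adds the bytes of one copied chunk to the stored total.
function updateProgress(id, chunkBytes) {
  const entry = progressStore.get(id);
  if (entry) {
    entry.processed += chunkBytes;
  }
}

// Returns [bytesUploaded, fileSize]; unknown ids yield a pair that
// keeps a `progress < size` loop waiting.
function getProgress(id) {
  const entry = progressStore.get(id);
  return entry
    ? [entry.processed, entry.size]
    : [0, Number.MAX_SAFE_INTEGER];
}
```

Returning a very large size for an unknown id plays the same role as PHP_INT_MAX in the docblock: a reader that starts before the upload is registered keeps waiting instead of exiting immediately.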

The problem with 2 open connections cannot be solved unless you give up EventSource in favour of good old polling. The response time of stream.php without the loop is a matter of milliseconds, and it is quite wasteful to keep the connection open all the time unless you need hundreds of updates a second.
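
A polling client along those lines might look like the sketch below. It is written under assumptions: pollProgress is a hypothetical helper, and fetchProgress is any async function that resolves to [bytesUploaded, fileSize], for example a fetch() against a loop-free stream.php that returns the two numbers as JSON (an endpoint shape the original does not specify).

```javascript
// Hypothetical polling loop. `fetchProgress` resolves to
// [bytesUploaded, fileSize]; `onProgress` receives each sample.
async function pollProgress(fetchProgress, onProgress, intervalMs = 1000) {
  for (;;) {
    const [uploaded, size] = await fetchProgress();
    onProgress(uploaded, size);
    if (uploaded >= size) {
      return uploaded; // upload complete, stop polling
    }
    // Wait before the next short-lived request.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Each request is short-lived, so a php process is held only for the milliseconds it takes to answer rather than for the whole duration of the upload.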
