How to make a progress bar on a web page for pandas operation

Question

I have been googling for a while and couldn't figure out how to do this. I have a simple Flask app which takes a CSV file, reads it into a Pandas dataframe, converts it and outputs it as a new CSV file. I have managed to upload and convert it successfully with this HTML:

<div class="container">
  <form method="POST" action="/convert" enctype="multipart/form-data">
    <div class="form-group">
      <br />
      <input type="file" name="file">
      <input type="submit" name="upload"/>
    </div>
  </form>
</div>

After I click submit, it runs the conversion in the background for a while and automatically triggers a download once it's done. The code that takes the result_df and triggers the download looks like this:

@app.route('/convert', methods=["POST"])
def convert():
  if request.method == 'POST':
    # Read uploaded file to df
    input_csv_f = request.files['file']
    input_df = pd.read_csv(input_csv_f)
    # TODO: Add progress bar for pd_convert
    result_df = pd_convert(input_df)
    if result_df is not None:
      resp = make_response(result_df.to_csv())
      resp.headers["Content-Disposition"] = "attachment; filename=export.csv"
      resp.headers["Content-Type"] = "text/csv"
      return resp

I'd like to add a progress bar to pd_convert, which is essentially a pandas apply operation. I found that tqdm works with pandas now and provides a progress_apply method to use instead of apply. But I'm not sure whether it is relevant for making a progress bar on a web page. I guess it should be, since it works in Jupyter notebooks. How do I add a progress bar for pd_convert() here?

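The end result I want is:

  1. The user clicks upload and selects a CSV file from their file system
  2. The user clicks submit
  3. The progress bar starts running
  4. Once the progress bar reaches 100%, the download is triggered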

1 and 2 are done now. The next question is how to trigger the download. For now, my convert function triggers the download with no problem because the response is formed from a file. If I want to render the page, I form the response with return render_template(...). Since I can only have one response, is it possible to have 3 and 4 with only one call to /convert?
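One common way around the single-response limit is to split the work across separate requests: one that starts the job and returns immediately, one that reports progress, and one that serves the finished file. The sketch below models that split with plain functions and a background thread. The names `start_job`, `get_progress`, `get_result` and `slow_convert` are hypothetical, and the in-memory dicts are only stand-ins for whatever shared store a real app would use.

```python
import threading
import uuid

# In-memory stand-ins (assumption); a real app would need a store shared
# between processes, such as Redis.
progress = {}  # job_id -> percent complete (0..100)
results = {}   # job_id -> finished CSV text


def slow_convert(job_id, rows):
    """Hypothetical stand-in for pd_convert that reports progress per row."""
    total = max(len(rows), 1)
    converted = []
    for i, row in enumerate(rows, start=1):
        converted.append(row.upper())
        progress[job_id] = i * 100 // total
    results[job_id] = "\n".join(converted)


def start_job(rows):
    """What a POST handler might do: start the work and return an id at once."""
    job_id = uuid.uuid4().hex
    progress[job_id] = 0
    worker = threading.Thread(target=slow_convert, args=(job_id, rows))
    worker.start()
    return job_id, worker


def get_progress(job_id):
    """What a progress endpoint might return for one job."""
    return progress[job_id]


def get_result(job_id):
    """What a download endpoint might return; the entry is removed once served."""
    return results.pop(job_id)
```

With this shape, the page would stay loaded while it polls or streams progress, and would request the download only after the job reports 100%.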

Not a web developer, still learning the basics. Thanks in advance!

==== EDIT ====

I tried the example here with some modifications. I get the progress from the row index in a for loop over the dataframe and put it in Redis. The client gets the progress from Redis as a stream by querying a new endpoint, /progress. Something like

@app.route('/progress')
def progress():
  """Get percentage progress for the dataframe process"""
  r = redis.StrictRedis(
    host=redis_host, port=redis_port, password=redis_password, decode_responses=True)
  r.set("progress", str(0))
  # TODO: Problem, 2nd submit doesn't clear progress to 0%. How to make independent progress for each client and clear to 0% on each submit
  def get_progress():

    p = int(r.get("progress"))
    while p <= 100:
      p = int(r.get("progress"))
      p_msg = "data:" + str(p) + "\n\n"
      yield p_msg
      logging.info(p_msg)
      if p == 100:
        r.set("progress", str(0))
      time.sleep(1)

  return Response(get_progress(), mimetype='text/event-stream')

It is currently working but with some issues. The reason is definitely my lack of understanding of this solution.

Issues:

  • I need the progress to be reset to 0 every time the submit button is pressed. I tried several places to reset it to 0 but haven't found a working version yet. It's definitely related to my lack of understanding of how the stream works. Right now it only resets when I refresh the page.
  • How do I handle concurrent requests, aka the Redis race condition? If multiple users make requests at the same time, the progress should be independent for each of them. I'm thinking about giving each submit event a random job_id and making it the key in Redis. Since I don't need the entry after each job is done, I will just delete the entry once it's done.
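The job_id idea in the second point could be sketched roughly as follows. This is only an illustration under assumptions: `FakeRedis` is a tiny in-memory stand-in exposing the same `get`/`set`/`delete` calls as a Redis client, and `progress_key`, `new_job` and `finish_job` are hypothetical helper names.

```python
import uuid


class FakeRedis:
    """In-memory stand-in (assumption) mimicking get/set/delete on a Redis client."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = str(value)

    def get(self, key):
        return self._data.get(key)  # None when the key is missing

    def delete(self, key):
        self._data.pop(key, None)


def progress_key(job_id):
    """Build a per-job key so concurrent jobs never share one progress value."""
    return f"progress:{job_id}"


def new_job(store):
    """Create a random job id and start its progress at 0."""
    job_id = uuid.uuid4().hex
    store.set(progress_key(job_id), 0)
    return job_id


def finish_job(store, job_id):
    """Delete the entry once the job is done, as the question proposes."""
    store.delete(progress_key(job_id))
```

Because every submit gets a fresh key that starts at 0, this would also address the reset problem in the first point, provided the client sends its job_id when asking for progress.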

I feel the part I'm missing is an understanding of text/event-stream. I feel I'm close to a working solution. Please share your opinion on the "proper" way to do this. I'm just guessing and trying to put together something that works with my very limited understanding.

Recommended answer

OK, I narrowed down the pieces I was missing and figured it out. The concepts I needed include

Backend

  • Redis as a key-value database to store the progress, which can be queried by the endpoint /progress for an event stream (HTML5)
  • Server-Sent Events (SSE) for streaming the progress: a response with the text/event-stream MIME type
  • A Python generator in the Flask app for SSE
  • Writing the progress (the row index being processed) of a for loop over the Pandas dataframe to Redis
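To make the SSE and generator points concrete, here is a minimal sketch of a generator that produces messages in the event-stream format, where each message is a `data:` line followed by a blank line. The function name `sse_stream` and its `read_progress` callable are assumptions for illustration; in the Flask app the callable would read the value from Redis.

```python
import time


def sse_stream(read_progress, poll_seconds=1.0):
    """Yield SSE-formatted progress messages until progress reaches 100.

    read_progress is any zero-argument callable returning the current
    percentage (hypothetical; for example a function that reads from Redis).
    """
    while True:
        percent = int(read_progress())
        # One SSE message: a "data:" field terminated by a blank line.
        yield f"data: {percent}\n\n"
        if percent >= 100:
            break
        time.sleep(poll_seconds)
```

A generator like this can be passed to Flask's `Response(..., mimetype='text/event-stream')`, which sends each yielded string to the client as it is produced.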

Frontend

  • Open the event stream: trigger SSE from the client side with an HTML button
  • Close the event stream: once the event data reaches 100%
  • Update the progress bar dynamically from the event stream using jQuery

Example HTML

  <script>
  function getProgress() {
    var source = new EventSource("/progress");
    source.onmessage = function(event) {
      $('.progress-bar').css('width', event.data+'%').attr('aria-valuenow', event.data);
      $('.progress-bar-label').text(event.data+'%');

      // Event source closed after hitting 100%
      if(event.data == 100){
        source.close()
      }
    }
  }
  </script>

  <body>
    <div class="container">
      ...
      <form method="POST" action="/autoattr" enctype="multipart/form-data">
        <div class="form-group">
        ...
          <input type="file" name="file">
          <input type="submit" name="upload" onclick="getProgress()" />
        </div>
      </form>

      <div class="progress" style="width: 80%; margin: 50px;">
        <div class="progress-bar progress-bar-striped active"
          role="progressbar" aria-valuenow="0" aria-valuemin="0" aria-valuemax="100" style="width: 0%">
          <span class="progress-bar-label">0%</span>
        </div>
      </div>
    </div>
  </body>

Example backend Flask code

import time

import redis
from flask import Flask, Response

app = Flask(__name__)

redis_host = "localhost"
redis_port = 6379
redis_password = ""
r = redis.StrictRedis(
  host=redis_host, port=redis_port, password=redis_password, decode_responses=True)

@app.route('/progress')
def progress():
  """Get percentage progress for auto attribute process"""
  r.set("progress", str(0))
  def progress_stream():
    p = int(r.get("progress"))
    while p < 100:
      p = int(r.get("progress"))
      p_msg = "data:" + str(p) + "\n\n"
      yield p_msg
      # Client closes EventSource on 100%, gets reopened when `submit` is pressed
      if p == 100:
        r.set("progress", str(0))
      time.sleep(1)

  return Response(progress_stream(), mimetype='text/event-stream')

The rest is the code for the Pandas for loop writing to Redis.
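That loop is not shown in the answer, so the following is only a sketch of what it might look like. It assumes a store object with a `set(key, value)` method, as a Redis client has; `DictStore`, `convert_with_progress` and `transform` are hypothetical names, and a plain list stands in for the dataframe rows.

```python
class DictStore:
    """In-memory stand-in (assumption) for a Redis client; records each write."""

    def __init__(self):
        self.data = {}
        self.writes = []

    def set(self, key, value):
        self.data[key] = str(value)
        self.writes.append(value)


def convert_with_progress(rows, transform, store, key="progress"):
    """Apply transform to each row, writing the integer percentage to the store."""
    total = len(rows)
    results = []
    last_percent = -1
    for i, row in enumerate(rows, start=1):
        results.append(transform(row))
        percent = i * 100 // total
        if percent != last_percent:
            # Write only when the percentage changes, to limit store traffic.
            store.set(key, percent)
            last_percent = percent
    return results
```

With a real dataframe, `rows` could be something like `list(df.itertuples(index=False))`, and `store` would be the Redis client `r` from the code above.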

I pieced this together from hours of Googling, so I feel it's best to document it here for people who also need this basic feature: adding a progress bar in a Flask web app for Pandas dataframe processing.

Some useful references

https://medium.com/code-zen/python-generator-and-html-server-sent-events-3cdf14140e56

https://codeburst.io/polling-vs-sse-vs-websocket-how-to-choose-the-right-one-1859e4e13bd9

What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet?
