django, fastcgi: how to manage a long running process?


Question

I have inherited a django+fastcgi application which needs to be modified to perform a lengthy computation (up to half an hour or more). What I want to do is run the computation in the background and return a "your job has been started"-type response. While the process is running, further hits to the URL should return "your job is still running" until the job finishes, at which point the results of the job should be returned. Any subsequent hit on the URL should return the cached result.

I'm an utter novice at django and haven't done any significant web work in a decade, so I don't know if there's a built-in way to do what I want. I've tried starting the process via subprocess.Popen(), and that works fine except that it leaves a defunct entry in the process table. I need a clean solution that can remove temporary files and any traces of the process once it has finished.

I've also experimented with fork() and threads and have yet to come up with a viable solution. Is there a canonical solution to what seems to me to be a pretty common use case? FWIW, this will only be used on an internal server with very low traffic.

Solution

I have to solve a similar problem now. It is not going to be a public site, but similarly, an internal server with low traffic.

Technical constraints:

  • all input data to the long running process can be supplied at its start
  • the long running process does not require user interaction (except for the initial input that starts it)
  • the computation takes long enough that the results cannot be served to the client in an immediate HTTP response
  • some sort of feedback (a progress bar of sorts) from the long running process is required.

Hence, we need at least two web "views": one to initiate the long running process, and another to monitor its status and collect the results.

We also need some sort of interprocess communication: sending user data from the initiator (the web server, on an HTTP request) to the long running process, and then sending its results to the receiver (again the web server, driven by HTTP requests). The former is easy, the latter less obvious. Unlike in normal unix programming, the receiver is not known initially. The receiver may be a different process from the initiator, and it may start while the long running job is still in progress or after it has already finished. So pipes do not work and we need some permanence for the results of the long running process.
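As a minimal illustration of that constraint (a sketch, not part of the original answer; the file path is hypothetical): the writer below stands in for the long running job and the reader for a later HTTP request. Because the file persists, the reader works whether it runs while the writer is still going or long after it has finished, which a pipe cannot offer.

# writer.py -- stands in for the long running job; results accumulate
# in a file whose name is agreed on out of band (hypothetical path).
out = open('/tmp/job-results.txt', 'a')
out.write('another result line\n')
out.close()

# reader.py -- stands in for a later HTTP request; it can run during or
# after the writer, because the file outlives both processes.
results = open('/tmp/job-results.txt')
print(results.readlines()[-1])
results.close()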

I see two possible solutions:

  • dispatch launches of the long running processes to the long running job manager (this is probably what the above-mentioned django-queue-service is);
  • save the results permanently, either in a file or in DB.

I preferred to use temporary files and to remember their location in the session data. I don't think it can be made any simpler.

A job script (this is the long running process), myjob.py:

import sys
from time import sleep

# Emit a line of "progress" every 0.1 s (about 100 s in total), flushing
# stdout so the output file can be polled while the job is still running.
i = 0
while i < 1000:
    print 'myjob:', i
    i += 1
    sleep(0.1)
    sys.stdout.flush()

django urls.py mapping:

urlpatterns = patterns('',
(r'^startjob/$', 'mysite.myapp.views.startjob'),
(r'^showjob/$',  'mysite.myapp.views.showjob'),
(r'^rmjob/$',    'mysite.myapp.views.rmjob'),
)
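A side note, not from the original answer: patterns() was deprecated in Django 1.8 and removed in 1.10, so on a modern Django the same three routes would be declared with path() from django.urls. A sketch, assuming the views module imports as shown:

from django.urls import path
from mysite.myapp import views

urlpatterns = [
    path('startjob/', views.startjob),
    path('showjob/', views.showjob),
    path('rmjob/', views.rmjob),
]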

django views:

from tempfile import mkstemp
from os import fdopen, unlink, kill
from subprocess import Popen
import signal

from django.http import HttpResponse, HttpResponseRedirect

def startjob(request):
    """Start a new long running process unless one is already running."""
    if 'job' not in request.session:
        # create a temporary file to hold the results
        outfd, outname = mkstemp()
        request.session['jobfile'] = outname
        outfile = fdopen(outfd, 'a+')
        # pass an argument list instead of shell=True, so that proc.pid
        # is the pid of the job itself and not of an intermediate shell
        proc = Popen(['python', 'myjob.py'], stdout=outfile)
        # remember the pid to terminate the job later
        request.session['job'] = proc.pid
    return HttpResponse('A <a href="/showjob/">new job</a> has started.')

def showjob(request):
    """Show the last result of the running job."""
    if 'job' not in request.session:
        return HttpResponse('Not running a job. '
                            '<a href="/startjob/">Start a new one?</a>')
    filename = request.session['jobfile']
    results = open(filename)
    lines = results.readlines()
    results.close()
    try:
        return HttpResponse(lines[-1] +
                            '<p><a href="/rmjob/">Terminate?</a>')
    except IndexError:
        # the job has not written anything yet
        return HttpResponse('No results yet. '
                            '<p><a href="/rmjob/">Terminate?</a>')

def rmjob(request):
    """Terminate the running job."""
    if 'job' in request.session:
        job = request.session['job']
        filename = request.session['jobfile']
        try:
            kill(job, signal.SIGKILL)  # unix only
            unlink(filename)
        except OSError:
            pass  # probably the job has finished already
        del request.session['job']
        del request.session['jobfile']
    return HttpResponseRedirect('/startjob/')  # start a new one
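Regarding the defunct process-table entries the question complains about: nothing here ever calls wait() on the Popen object, so finished jobs will linger as zombies. On Unix, one common fix (a sketch, not part of the original answer) is to have the kernel reap children automatically by ignoring SIGCHLD once at server startup:

import signal

# Run once when the server process starts (e.g. in the FastCGI bootstrap).
# With SIGCHLD set to SIG_IGN, terminated children are reaped by the kernel
# and never appear as <defunct>. Caveat: Popen.wait() and os.waitpid() can
# then no longer retrieve a child's exit status, so only use this if the
# server never needs one.
signal.signal(signal.SIGCHLD, signal.SIG_IGN)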
