Setting a limit for running time with a Python 'while' loop


Question

I have some questions about setting a maximum running time in Python. I would like to use pdfminer to convert PDF files to .txt. The problem is that, very often, some files cannot be decoded and take an extremely long time. So I want to use time.time() to limit the conversion time for each file to 20 seconds. In addition, I run under Windows, so I cannot use the signal function.

I succeeded in running the conversion code with pdfminer.convert_pdf_to_txt() (in my code the module is "c"), but I could not integrate time.time() into the while loop. It seems to me that, in the following code, the while loop and time.time() do not work.

To sum up, I want to:

  1. Convert the PDF files to .txt files

  2. Limit each conversion to 20 seconds. If it runs out of time, throw an exception and save an empty file

  3. Save all the txt files under the same folder

  4. If there are any exceptions/errors, still save the file, but with empty content.

Here is the current code:

import converter as c
import os
import timeit
import time

yourpath = 'D:/hh/'

for root, dirs, files in os.walk(yourpath, topdown=False):

    for name in files:

        t_end = time.time() + 20

        try:
            while time.time() < t_end:

                c.convert_pdf_to_txt(os.path.join(root, name))

                t = os.path.split(os.path.dirname(os.path.join(root, name)))[1]
                a = str(os.path.split(os.path.dirname(os.path.join(root, name)))[0])

                g = str(a.split("\\")[1])
                with open("D:/f/" + g + "&" + t + "&" + name + ".txt", mode="w") as newfile:
                    newfile.write(c.convert_pdf_to_txt(os.path.join(root, name)))
                    print "yes"

            if time.time() > t_end:

                print "no"

                with open("D:/f/" + g + "&" + t + "&" + name + ".txt", mode="w") as newfile:
                    newfile.write("")

        except KeyboardInterrupt:
           raise

        except:
            for name in files:
                t = os.path.split(os.path.dirname(os.path.join(root, name)))[1]
                a = str(os.path.split(os.path.dirname(os.path.join(root, name)))[0])

                g = str(a.split("\\")[1])
                with open("D:/f/" + g + "&" + t + "&" + name + ".txt", mode="w") as newfile:
                    newfile.write("")

Recommended answer

Your approach is flawed.

You define the end time and immediately enter the while loop, because at that point the current timestamp is lower than the end timestamp (the condition is always True on the first check). Once inside the loop, you get stuck in the conversion function: the loop condition is not checked again until that call returns.
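The effect can be reproduced without pdfminer. The sketch below is only an illustration: slow_call is a hypothetical stand-in for the conversion, and the 0.2 s limit and 0.5 s call duration are arbitrary values chosen to keep the demo short.

```python
import time


def slow_call(seconds):
    """Hypothetical stand-in for a long-running conversion."""
    time.sleep(seconds)


def run_with_naive_limit(limit, call_duration):
    """Run the naive time-limited loop and return how long it really took."""
    start = time.time()
    t_end = start + limit
    while time.time() < t_end:
        # The condition above is not evaluated again until this call returns.
        slow_call(call_duration)
    return time.time() - start


elapsed = run_with_naive_limit(limit=0.2, call_duration=0.5)
print("limit was 0.2 s, loop actually ran for %.2f s" % elapsed)
```

The loop overshoots its limit by however long the blocking call takes, which is why a 20 second limit has no effect on a conversion that hangs.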

I would suggest the signal module, which is already included in Python. It allows you to quit a function after n seconds. A basic example can be seen in this Stack Overflow answer. Note that signal.alarm() is only available on Unix, so for Windows the code below uses threading.Timer together with thread.interrupt_main instead.
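For reference, a signal-based timeout could look roughly like the sketch below. It is not taken from the linked answer; the names ConversionTimeout and call_with_alarm are made up for illustration, and it assumes a Unix system and the main thread, so it does not help on Windows.

```python
import signal
import time


class ConversionTimeout(Exception):
    """Raised when the wrapped call exceeds its time limit (hypothetical name)."""


def _on_alarm(signum, frame):
    # Runs in the main thread when SIGALRM arrives.
    raise ConversionTimeout()


def call_with_alarm(func, seconds, *args):
    """Run func(*args); raise ConversionTimeout after `seconds` (Unix only)."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # schedule SIGALRM
    try:
        return func(*args)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)  # restore previous handler


try:
    call_with_alarm(time.sleep, 0.2, 5)  # a 5 s call with a 0.2 s limit
    print("finished in time")
except ConversionTimeout:
    print("timed out")
```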

Your code would be like this:

import converter as c
import os
import threading

try:
    import _thread as thread  # Python 3
except ImportError:
    import thread  # Python 2

yourpath = 'D:/hh/'
TIME_LIMIT = 20.0  # seconds allowed per file, as asked in the question

for root, dirs, files in os.walk(yourpath, topdown=False):
    for name in files:
        path = os.path.join(root, name)

        # Build the output name first so every branch below can use it.
        # The naming scheme is kept from the question; it assumes the
        # grandparent path contains a backslash-separated component.
        t = os.path.split(os.path.dirname(path))[1]
        a = str(os.path.split(os.path.dirname(path))[0])
        g = str(a.split("\\")[1])
        outname = "D:/f/" + g + "&" + t + "&" + name + ".txt"

        text = ""  # stays empty on timeout or error

        # When the timer fires, KeyboardInterrupt is raised in the main thread.
        timer = threading.Timer(TIME_LIMIT, thread.interrupt_main)
        timer.start()
        try:
            text = c.convert_pdf_to_txt(path)
            print("yes")
        except KeyboardInterrupt:
            # Timed out. A real Ctrl+C during conversion also ends up here.
            print("no")
        except Exception:
            # Any conversion error: fall through and save an empty file.
            print("no")
        finally:
            timer.cancel()

        with open(outname, mode="w") as newfile:
            newfile.write(text)



Just for the future: indent with four spaces, and don't use too much whitespace ;)
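A different technique that also works on Windows is to run each conversion in a child process and let subprocess.run enforce the limit, since a child process can be killed reliably while a thread cannot. This is a swapped-in alternative rather than the answer's method. The sketch assumes Python 3, and run_with_timeout and the tiny child scripts are hypothetical placeholders for a real conversion command.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd in a child process; return its stdout, or None on timeout/failure."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=seconds,  # the child is killed when this expires
            check=True,       # a non-zero exit status raises CalledProcessError
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        return None
    return result.stdout.decode("utf-8", errors="replace")


# Hypothetical usage: tiny child scripts stand in for the real conversion.
fast = run_with_timeout([sys.executable, "-c", "print('converted text')"], 20)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1)
print(repr(fast), repr(slow))
```

In the real case the command would be a small script that converts one PDF and prints the text; a None result then maps to writing an empty .txt file.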
