Headless script crashes after a few runs
Problem description
I have a script using a headless browser which I'm running from a cron job (set up with crontab -e). It runs fine the first few times and then crashes with the following traceback:

    Traceback (most recent call last):
      File "/home/clint-selenium-firefox.py", line 83, in <module>
        driver.get(url)
      File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 248, in get
        self.execute(Command.GET, {'url': url})
      File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
        self.error_handler.check_response(response)
      File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
        raise exception_class(message, screen, stacktrace)
    selenium.common.exceptions.WebDriverException: Message: Failed to decode response from marionette
My crontab line is:
*/10 * * * * export DISPLAY=:0 && python /home/clint-selenium-firefox.py >> /home/error.log 2>&1
I don't want to overload this with the Python script, so I've pulled out what I think are the relevant bits:

    from pyvirtualdisplay import Display

    display = Display(visible=0, size=(800, 600))
    display.start()
    ...
    driver = webdriver.Firefox()
    driver.get(url)
    ...
    driver.quit()
    ...
    display.stop()
Your help is much appreciated.
EDIT
Versions: Firefox 49.0.2; Selenium 3.0.1; geckodriver: geckodriver-v0.11.1-linux64.tar.gz
Code around the error (failing on driver.get(url)):

    driver = webdriver.Firefox()
    if DEBUG: print "Opened Firefox"
    for u in urls:
        list_of_rows = []
        list_of_old_rows = []
        # get the old version of the site data
        mycsvfile = u[1]
        try:
            with open(mycsvfile, 'r') as csvfile:
                old_data = csv.reader(csvfile, delimiter=' ', quotechar='|')
                for o in old_data:
                    list_of_old_rows.append(o)
        except: pass
        # get the new data
        url = u[0]
        if DEBUG: print url
        driver.get(url)
        if DEBUG: print driver.title
        time.sleep(1)
        page_source = driver.page_source
        soup = bs4.BeautifulSoup(page_source, 'html.parser')
Solution
From the issue "Multiple Firefox instances failing with NS_ERROR_SOCKET_ADDRESS_IN_USE" (#99):

This is because no --marionette-port option is passed to geckodriver - which means all instances of geckodriver launch firefox passing the same desired default port (2828). The first firefox instance binds to that port, future instances can't and all the geckodriver instances end up connecting to the first firefox instance - which produces all sorts of unpredictable behavior.
Followed by:

I think a reasonable short-term solution is to do what the other drivers are doing and ask Marionette to bind to a randomised, free port generated by geckodriver. Currently it uses 2828 as the default for all instances it spawns of Firefox.

Since Marionette unfortunately does not yet have an out-of-band way of communicating the port back to the client (geckodriver), this is inherently racy but we can improve the situation in the future with one of the proposals from bug 1240830.
This change was made in Selenium 3.0.0.b2:
* Updated Marionette port argument to match other drivers.
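To make the port discussion above concrete, here is a minimal Python 3 sketch using only the standard library. It is not geckodriver's or Selenium's actual implementation; the helper names is_port_in_use and pick_free_port are hypothetical. The first helper checks whether something is already listening on Marionette's default port 2828, and the second asks the OS for an unused port, which also shows why the quoted comment calls the approach "inherently racy".

```python
import socket


def is_port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds,
        # which means a listener already owns the port.
        return probe.connect_ex((host, port)) == 0


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # The socket is closed as soon as we return, so another process
        # could claim the same port before it is reused: that gap is the race.
        return sock.getsockname()[1]


if __name__ == "__main__":
    print("Marionette default port 2828 busy:", is_port_in_use(2828))
    print("Candidate free port:", pick_free_port())
```

If the first line prints True before your script has started Firefox, an earlier Firefox instance is probably still holding the port.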
I guess a random port only works for so long. Raise an issue; a code fix may be required for the versions of Selenium, Firefox and geckodriver that you have. Until this is fixed, you could drop back to Selenium 2.53.0 and Firefox ESR 38.8. Your call.
UPDATE: Try

    from selenium import webdriver
    from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

    binary = FirefoxBinary('path/to/binary')
    driver = webdriver.Firefox(firefox_binary=binary)
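Separately from the port fix, one thing that may be worth ruling out is a Firefox instance left behind by a run that crashed before driver.quit() was reached, since the issue quoted above describes later instances ending up attached to the first one. That this happens in your setup is an assumption, not something the traceback confirms. A minimal Python 3 sketch of guaranteed cleanup follows; run_with_cleanup and its three arguments are hypothetical names, with start_display and start_driver standing in for callables that build your Display and webdriver.Firefox objects.

```python
from contextlib import ExitStack


def run_with_cleanup(start_display, start_driver, scrape):
    """Run scrape(driver), always quitting the driver and stopping the display.

    start_display and start_driver are hypothetical factories, for example
    lambda: Display(visible=0, size=(800, 600)) and webdriver.Firefox.
    """
    with ExitStack() as stack:
        display = start_display()
        display.start()
        # Registered first, so it runs last when the stack unwinds.
        stack.callback(display.stop)

        driver = start_driver()
        # Registered second, so it runs first (callbacks are LIFO).
        stack.callback(driver.quit)

        # Any exception raised here still triggers both callbacks.
        return scrape(driver)
```

With this shape, an exception inside the scraping loop still propagates to cron's log, but the browser and the virtual display are shut down before the process exits.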